Document Usage Guidelines

• Should be used only for enrolled students
• Not meant to be a self-paced document; an instructor is needed
• Do not distribute
Course Prerequisites

• Required:
- Splunk Enterprise System Administration
- Splunk Enterprise Data Administration
• Strongly recommended:
- Troubleshooting Splunk Enterprise
- Architecting Splunk Enterprise Deployments
- Working Linux knowledge
- At least 3 months of hands-on Splunk administration experience

Note: In order to receive credit for this course, you must complete all lab exercises.
Course Goals

• Identify factors affecting large-scale Splunk deployments
• Set up Splunk indexer clusters
• Deploy and configure a Splunk search head cluster
• Add new nodes into an existing cluster
• Decommission nodes from an existing cluster
• Deploy apps and configuration bundles in Splunk clusters
• Manage KV store collections and lookups in Splunk clusters
• Monitor and identify clustering issues with Monitoring Console
• Scale Splunk indexer cluster with SmartStore
Course Outline

• Module 1: Large-scale Splunk Deployment Overview
• Module 2: Single-site Indexer Cluster
• Module 3: Multisite Indexer Cluster
• Module 4: Indexer Cluster Management and Administration
• Module 5: Forwarder Configuration
• Module 6: Search Head Cluster (SHC)
• Module 7: SHC Management and Administration
• Module 8: KV Store Collection and Lookup Management
• Module 9: SmartStore Implementation
Module 1: Large-scale Splunk Deployment Overview
Module Objectives

• Review Splunk deployment options
• Identify factors that affect large-scale deployment design
• Describe how Splunk can scale
• Configure a Splunk License Manager
Review: Splunk Deployment Options

• Key Splunk functions:
- Consumes data, stores indexed data, and searches indexed data
• Splunk scales by splitting its functionality across dedicated instances

[Diagram: dedicated instances, including a Deployment Server]
Reference Servers for Distributed Deployment

           Indexer                              Search head
OS         Linux or Windows 64-bit distribution (both)
Network    1Gb Ethernet NIC (optional second NIC for a management network) (both)
Memory     12 GB RAM (both)
CPU        Intel 64-bit chip architecture       Intel 64-bit chip architecture
           12 CPU cores                         16 CPU cores
           Running at 2+ GHz                    Running at 2+ GHz
Disk       Disk subsystem capable of 800 IOPS   2 x 10K RPM 300GB SAS drives - RAID 1

http://docs.splunk.com/Documentation/Splunk/latest/Capacity/Referencehardware

• Ratio of indexers to search heads depends on the number of concurrent users and the indexing volume per node
http://docs.splunk.com/Documentation/Splunk/latest/Capacity/Summaryofperformancerecommendations

Note: See the Architecting Splunk course
Introducing Splunk Clustering

• Using commodity hardware, configure indexers to replicate indexes or group search heads to coordinate their search activities and loads
• Allows you to balance growth, speed of recovery, and overall disk usage

Indexing tier
- High Availability (HA): single-site cluster
» Index replication
» Flexible replication policies
- Disaster Recovery (DR): multisite cluster
» Can withstand entire site failure
» Supports active-passive and active-active configurations
- SmartStore reduces the storage footprint while maintaining HA/DR

Search tier
- High Availability (HA):
» Search head or search head cluster
- Disaster Recovery (DR):
» Search affinity (site-aware)
» Search head or search head cluster
Splunk Server Roles in Splunk Clusters

[Diagram: Search Heads, Monitoring Console, Deployer, Indexers, Manager Node, Deployment Server, License Manager, and load-balanced Forwarders]
Splunk Server Roles in Splunk Clusters (cont.)

License Manager      Allocates license capacity and manages license usage of all cluster members
Manager Node         Regulates the functioning of an indexer cluster
Peer Node            Indexer (search peer) that participates in an indexer cluster
Search Head          Participates in clusters as stand-alone or search head cluster member
Deployment Server    Centralized configuration manager for forwarders
Deployer             Distributes configurations to search head cluster members
Monitoring Console   Allows admins to monitor performance details regarding your Splunk environment

• Splunk recommends that you dedicate a host for each role
- You can enable multiple Splunk server roles on a server with caveats
» You will learn more about caveats of each server role throughout this course
License Manager Configuration

• By default, every instance is a License Manager
• All cluster members must share:
- The same licensing pool
- The same licensing configuration
• Only incoming data counts against the license
- Replicated data does not count
• Cannot use a free license for clustering
• You should forward the license manager's internal logs to the indexing layer
- These logs do not count against the license
- You can run searches against the license logs
- Any unusual condition is visible on all search heads
License Manager Configuration (cont.)

• To add a license to a License Manager, run:
splunk add licenses {path to license file}

• To check the list of license peers from the license manager, run:
splunk list licenser-slaves

• To switch to a License Peer, run this command on each peer:
splunk edit licenser-localslave -master_uri https://{LM:Mport}

• To check the peer configuration of a particular node, run:
splunk list licenser-localslave
Lab Exercise 1 - Configure Splunk License Manager

• Time: 15 - 20 minutes
• Tasks:
- Access your designated Splunk environment
- Set up password-less SSH connection
- Configure your License Manager instance
- Log into Splunk Web and verify the license information
Lab Exercise 1 - Configure Splunk License Manager (cont.)

[Diagram: SSH access to the lab hosts. Commands shown: ssh you@Public_DNS and ssh you@10.0.x.2. Hosts shown: 10.0.x.1 and Misc-Server (10.0.x.3). splunkd ports shown: 8189, 8289, 8389.]

x = Your student ID
8?89 = splunkd-port
Lab Exercise 1 - Configure Splunk License Manager (cont.)

[Diagram: browser access to the lab instances, including the indexers idx1, idx2, idx3, idx4 and dserver]

http://{Public_DNS}/{splunk_server}
For example, http://{Public_DNS}/dserver

Public_DNS = Same as your Misc-Server
splunk_server = Splunk server name
Module 2: Single-site Indexer Cluster
Module Objectives

• Describe how Splunk single-site indexer clusters work
• Identify cluster components and terms
• Implement a single-site indexer cluster
• Search and review internal logs related to indexer clustering
Single-site Indexer Cluster Overview

• Manager node
- There can only be one cluster manager
- Controls and manages index replication
- Distributes app bundles to peer nodes
- Tells the search head which peers to search
• Peer nodes
- Index data from inputs/forwarders
- Replicate data to other peer nodes as instructed by the manager
• Search head
- Required component of indexer cluster
- Relies on the manager for its target search peers
- Works the same as any Splunk search head
• Forwarders
- Send data to peer nodes

[Diagram: search head using distributed search across the peer nodes; forwarder with useACK enabled]
Indexer Cluster Considerations

Benefits
• Data availability and fast recovery
• Easier overall administration
- Coordinated indexer configuration management
- Automatic distributed search setup
- Elastic indexer discovery
- Indexer peer node status dashboard on the manager node
• Scale-out indexing capacity
• No additional cost for data replication

Trade-offs
• Increased storage requirements
• Increased processing load
- Depending on the replication & search factors
• Requires additional Splunk instances
- Minimum:
» Replication Factor (RF) + MN + SH
- Recommended:
» More than RF + MN + more SHs
• No support for heterogeneous indexers
- Requires same OS and Splunk versions
• Requires cluster-specific deployment management
Indexer Cluster System Requirements

• Each node must run on its own host
• The manager node must run the same or a later version than the peer nodes and search heads
- Can run at most three minor versions later than the peer nodes
» 8.1 manager node can run against 8.1 and 8.0
• The search heads must run the same or a later version than the peer nodes
• All peer nodes must run EXACTLY the same version
• Peer node storage requirements:
- Ability to sustain 800 IOPS for each peer node
- The ratio of disks to disk controllers should mimic a database system requirement
• Cluster recovery depends upon system resources available on manager node

http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Systemrequirements
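
The version rules above can be expressed as a small pre-flight check. This is only a sketch: the function name, the (major, minor) tuple representation, and the choice to compare the minor-version gap only within one major version are assumptions made for illustration, not Splunk behavior.

```python
from typing import List, Tuple

Version = Tuple[int, int]  # (major, minor), for example (8, 1)


def check_cluster_versions(manager: Version,
                           search_heads: List[Version],
                           peers: List[Version],
                           max_minor_gap: int = 3) -> List[str]:
    """Return rule violations; an empty list means no rule was broken."""
    problems = []
    peer_versions = sorted(set(peers))
    # All peer nodes must run exactly the same version.
    if len(peer_versions) > 1:
        problems.append("peer nodes do not all run exactly the same version")
    for peer in peer_versions:
        if manager < peer:
            problems.append(f"manager {manager} is older than peer {peer}")
        # Assumption: the minor-version gap is only compared within one major version.
        elif manager[0] == peer[0] and manager[1] - peer[1] > max_minor_gap:
            problems.append(
                f"manager {manager} is more than {max_minor_gap} minor versions ahead of peer {peer}")
    for sh in sorted(set(search_heads)):
        if manager < sh:
            problems.append(f"manager {manager} is older than search head {sh}")
        for peer in peer_versions:
            if sh < peer:
                problems.append(f"search head {sh} is older than peer {peer}")
    return problems


if __name__ == "__main__":
    # The slide's example: an 8.1 manager against 8.0 peers.
    print(check_cluster_versions((8, 1), [(8, 1)], [(8, 0), (8, 0)]))
```

Returning a list of messages rather than raising on the first problem lets you review every violation before an upgrade.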
Indexer Cluster Deployment Overview

1. Identify your clustering requirements
- Replication policy, disk space, number of peer nodes, etc.
2. Install Splunk Enterprise and configure cluster instances
- One manager, at least two indexers, and one search head
- Synchronize the system clocks on all machines
3. Enable clustering on each cluster instance
- Use Splunk Web UI, CLI, or manually edit server.conf
4. Create and distribute configuration bundles to the peer nodes
5. Forward data to the peer nodes
Best Practices and Guidelines

• Plan, plan, and plan
- A single cluster or segregated clusters (by sourcetype, department, or use case)
• Cluster instances should not share hardware
- Dedicate hardware to the manager node, search head, and peer nodes
- All members share the same license pool
- Each peer node must have its own storage
• Number of peer nodes is determined by:
- Expected availability requirements of your organization
- Level of replication required, daily data rate, retention policy, and concurrent users
» Index replication does not increase your licensing usage
• Cannot use a deployment server to distribute configuration bundles directly to peer nodes


Generated for Chng Wei Min (wchng@micron.com) (C) Splunk Inc, not for distribution 


Splunk Cluster Administration
Copyright © 2022 Splunk, Inc. All rights reserved | 25 February 2022


Where to Install the Manager Node

• On a dedicated host
- Cannot be shared with a peer node or search head instance
- Built-in search head for debugging purposes
• Under certain limited circumstances, it can fulfill additional server roles
- License manager
- Monitoring Console
- Deployer
• DO NOT co-locate a deployment server on the manager node under any circumstances

Note: When hosting additional roles, the cluster ...
• 10 search heads

Note: Manager node failover is discussed later in this module.
Single-site Cluster: Key Specifications

• Peer nodes copy buckets to other peer nodes (index replication)
- The copied buckets may be searchable buckets or contain only rawdata
• Replication factor
- Specifies how many total copies of rawdata the cluster should maintain
- Sets the total failure tolerance level
• Search factor
- Specifies how many copies are searchable
» A searchable bucket contains both rawdata and index files
» Its rawdata is counted as a part of the replication factor
- Cannot be larger than the replication factor
- Determines how quickly you can recover the search capability
» A trade-off between disk usage and search availability
• Security key (pass4SymmKey)
- Authenticates communication between the cluster nodes
- The key must be the same across all cluster instances
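
As a quick illustration of how the two factors relate, the sketch below checks a proposed replication factor and search factor against the number of peer nodes. The function name and the returned dictionary keys are hypothetical; the figure of replication factor minus one tolerated peer failures is the usual reading of "failure tolerance", stated here as an assumption.

```python
def check_factors(replication_factor: int, search_factor: int, peer_count: int) -> dict:
    """Validate RF/SF for a single-site cluster and summarize the resulting copies."""
    if replication_factor < 1 or search_factor < 1:
        raise ValueError("both factors must be at least 1")
    if search_factor > replication_factor:
        raise ValueError("search factor cannot be larger than the replication factor")
    if peer_count < replication_factor:
        raise ValueError("the cluster needs at least as many peer nodes as the replication factor")
    return {
        "rawdata_copies": replication_factor,                  # every copy holds rawdata
        "searchable_copies": search_factor,                    # rawdata plus index files
        "rawdata_only_copies": replication_factor - search_factor,
        # Assumption: with RF copies on distinct peers, RF - 1 peers can fail without data loss.
        "peer_failures_tolerated": replication_factor - 1,
    }


if __name__ == "__main__":
    print(check_factors(replication_factor=3, search_factor=2, peer_count=3))
```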
Estimating Disk Usage

• For this example, assume:
- Daily index data = ~100GB
- rawdata on disk = ~15% of daily index data
- index files on disk = ~35% of daily index data

Note: Per day storage must still be multiplied by the retention period!

Daily index data = ~100GB    RF=3 & SF=2           RF=3 & SF=2           RF=3 & SF=3
                             on 3 peer nodes       on 6 peer nodes       on 6 peer nodes
rawdata (~15GB/day)          15 * 3 = 45 GB        15 * 3 = 45 GB        15 * 3 = 45 GB
index files (~35GB/day)      35 * 2 = 70 GB        35 * 2 = 70 GB        35 * 3 = 105 GB
Total size across cluster    115 GB                115 GB                150 GB
Per peer storage / day       115 / 3 = ~38.3 GB    115 / 6 = ~19.2 GB    150 / 6 = 25 GB
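
The arithmetic in the table can be reproduced with a few lines of Python. This is a sketch under the slide's stated assumptions (rawdata at about 15% and index files at about 35% of daily index data); the function name and the `retention_days` parameter are illustrative additions.

```python
def estimate_storage(daily_gb: float, replication_factor: int, search_factor: int,
                     peers: int, retention_days: int = 1,
                     raw_ratio: float = 0.15, index_ratio: float = 0.35) -> dict:
    """Estimate clustered disk usage in GB from the slide's rule-of-thumb ratios."""
    rawdata_gb = daily_gb * raw_ratio * replication_factor    # one rawdata copy per replica
    index_files_gb = daily_gb * index_ratio * search_factor   # index files only on searchable copies
    total_gb = rawdata_gb + index_files_gb
    per_peer_gb = total_gb / peers
    return {
        "rawdata_gb": rawdata_gb,
        "index_files_gb": index_files_gb,
        "total_gb_per_day": total_gb,
        "per_peer_gb_per_day": per_peer_gb,
        "per_peer_gb_retained": per_peer_gb * retention_days,
    }


if __name__ == "__main__":
    for rf, sf, peers in [(3, 2, 3), (3, 2, 6), (3, 3, 6)]:
        est = estimate_storage(100, rf, sf, peers)
        print(f"RF={rf} SF={sf} peers={peers}: "
              f"total {est['total_gb_per_day']:.1f} GB/day, "
              f"per peer {est['per_peer_gb_per_day']:.1f} GB/day")
```

Passing `retention_days` multiplies the per-peer daily figure by the retention period, which is the step the note on the slide warns about.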
Configuring Splunk Cluster

• There are three ways to configure
- Splunk Web, CLI, and server.conf
- In this course, you will use CLI for configurations and Splunk Web for monitoring
• Enable clustering on the instances in the order of
Manager node > Peer nodes > Search heads
• Get help on Splunk cluster commands:
- splunk help cluster
- splunk help [list|edit] cluster-config




Ports for Indexer Clustering

[Diagram: connections between the manager node, the other cluster nodes, and the forwarder]

Legend:
- Management (splunkd port)
- Replication (replication port)
- Data (receiving port)

Note: To participate in this indexer cluster, all nodes, including the search head, must use the same pass4SymmKey.
Configuring Splunk Manager Node

Enter the single command:

> splunk edit cluster-config
    -mode manager
    -replication_factor <value>
    -search_factor <value>
    -secret mycluster

Results in SPLUNK_HOME/etc/system/local/server.conf:

[clustering]
mode = manager
replication_factor = <value>
pass4SymmKey = <hashed secret>

• Splunk defaults to:
- replication_factor = 3
- search_factor = 2
• The secret parameter is encrypted and saved as pass4SymmKey
- Required
- mycluster is the password for this cluster example
• To see the decrypted pass4SymmKey, run:
splunk show-decrypted --value '<hashed_secret>'

Note: Splunk 7.2+ encrypts all new secrets using a new cipher. However, it will not auto-migrate existing hashes. You can update them by changing the splunk.secret file on each instance to use the new cipher. To change, run: splunk rotate splunk-secret
Configuring the Peer Nodes

> splunk enable listen 9997
> splunk edit cluster-config
    -mode peer
    -master_uri https://10.0.1.3:8089
    -secret mycluster
    -replication_port 9100

Results in SPLUNK_HOME/etc/system/local/server.conf:

[clustering]
mode = slave
master_uri = https://10.0.1.3:8089
pass4SymmKey = <hashed secret>

[replication_port://9100]

• Ports required on each peer:
- Receiving port to listen to forwarders
- Replication port to communicate with other peer nodes
• In this example:
- 9997 = the forwarder listening port
- 10.0.1.3 = manager node address
- 8089 = manager node's splunkd-port
- mycluster = same cluster password
- 9100 = index replication port
Configuring the Search Head

> splunk edit cluster-config
    -mode searchhead
    -master_uri https://10.0.1.3:8089
    -secret mycluster

Results in SPLUNK_HOME/etc/system/local/server.conf:

[clustering]
mode = searchhead
master_uri = https://10.0.1.3:8089
pass4SymmKey = <hashed secret>

• To configure as a cluster search head:
- edit cluster-config
• Functions as a regular search head
• For more help:
splunk help [list|add|edit|remove] cluster-master














Adding SH to an Additional Indexer Cluster

> splunk add cluster-master
    -master_uri https://20.0.2.6:8089
    -secret yourCluster

• SHs can belong to multiple clusters
• To allow SH to search additional clusters:
splunk add cluster-master

Results in SPLUNK_HOME/etc/system/local/server.conf:

[clustering]
mode = searchhead
master_uri = clustermaster:10.0.1.3:8089, clustermaster:20.0.2.6:8089

[clustermaster:10.0.1.3:8089]
master_uri = https://10.0.1.3:8089
pass4SymmKey = <hashed secret 1>

[clustermaster:20.0.2.6:8089]
master_uri = https://20.0.2.6:8089
pass4SymmKey = <hashed secret 2>
Generations

• Identifies which copies of a cluster's buckets are primary
- These buckets will participate in search
- Non-searchable buckets will lack .tsidx files, metadata, etc.
• Changes over time, as peers leave and join the cluster
- Cluster rebalancing
- Primary reassignment after a peer goes down
• How cluster nodes use the generation
- CM creates each new generation and assigns an ID
- Peers keep track of which bucket copies are primary in the generation
Clustered Buckets

• Buckets are located in: $SPLUNK_HOME/var/lib/splunk
• Naming convention for clustered buckets also distinguishes types of copies: originating vs replicated

Bucket type             Hot bucket           Warm/cold bucket
Non-clustered           hot_v1_<localid>     db_<newest_time>_<oldest_time>_<localid>
Clustered originating   hot_v1_<localid>     db_<newest_time>_<oldest_time>_<localid>_<guid>
Clustered replicated    <localid>_<guid>     rb_<newest_time>_<oldest_time>_<localid>_<guid>

- newest_time & oldest_time = timestamps indicating the age of data in the bucket
- localid = an ID for the bucket
- guid = the guid of the source indexer

Note: The guid is located in the peer's $SPLUNK_HOME/etc/instance.cfg
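
To make the naming convention concrete, here is a minimal sketch that classifies a bucket directory name according to the table above. The function name, the returned keys, and the sample timestamps are hypothetical; the patterns simply mirror the three naming rows and assume the GUID uses the 8-4-4-4-12 hexadecimal layout seen elsewhere in this course.

```python
import re
from typing import Optional

GUID = r"[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}"
WARM_COLD = re.compile(
    rf"^(?P<prefix>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<localid>\d+)(?:_(?P<guid>{GUID}))?$")
HOT_LOCAL = re.compile(r"^hot_v1_(?P<localid>\d+)$")
HOT_REPLICATED = re.compile(rf"^(?P<localid>\d+)_(?P<guid>{GUID})$")


def parse_bucket_dir(name: str) -> Optional[dict]:
    """Classify a bucket directory name; return None if it matches no known pattern."""
    m = WARM_COLD.match(name)
    if m:
        if m["prefix"] == "rb":
            kind = "clustered replicated"
        elif m["guid"]:
            kind = "clustered originating"
        else:
            kind = "non-clustered"
        return {"state": "warm/cold", "kind": kind, "localid": int(m["localid"]),
                "guid": m["guid"], "newest_time": int(m["newest"]),
                "oldest_time": int(m["oldest"])}
    m = HOT_LOCAL.match(name)
    if m:
        # Hot buckets use the same name whether or not the indexer is clustered.
        return {"state": "hot", "kind": "non-clustered or clustered originating",
                "localid": int(m["localid"]), "guid": None}
    m = HOT_REPLICATED.match(name)
    if m:
        return {"state": "hot", "kind": "clustered replicated",
                "localid": int(m["localid"]), "guid": m["guid"]}
    return None


if __name__ == "__main__":
    # Timestamps below are made-up sample values.
    print(parse_bucket_dir("rb_1520000000_1519990000_3_1318C26A-D9FE-45F0-AF51-7A8038F5419C"))
```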
How the Cluster Manager Sees Buckets

• Buckets are identified using three pieces of information:
- Index name, local ID, GUID of original indexer
• Buckets can appear in two different ways: name vs separate field
_audit~3~1318C26A-D9FE-45F0-AF51-7A8038F5419C
vs
index: _audit, bucket id: 3 1318C26A-D9FE-45F0-AF51-7A8038F5419C
• Searchable buckets may be flagged as "primary"
- REST API: /services/cluster/master/buckets
- Returns: 0x000000000 (non-primary) or 0xffffffff (primary)
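
The single-name form shown above joins the three identifying pieces with a tilde. A minimal sketch for splitting it apart, with a hypothetical `BucketId` type introduced only for this example:

```python
from typing import NamedTuple


class BucketId(NamedTuple):
    index: str
    local_id: int
    guid: str


def parse_bucket_id(bucket_id: str) -> BucketId:
    """Split 'index~localid~guid' into its three parts."""
    # Split from the right so that only the last two '~' act as separators.
    index, local_id, guid = bucket_id.rsplit("~", 2)
    return BucketId(index, int(local_id), guid)


if __name__ == "__main__":
    print(parse_bucket_id("_audit~3~1318C26A-D9FE-45F0-AF51-7A8038F5419C"))
```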
Manager Dashboard - Single-site Cluster

Settings > Indexer clustering

[Screenshot: Indexer Clustering: Master Node dashboard. Status indicators: All Data is Searchable, Search Factor is Met, Replication Factor is Met. Tabs: Peers (3), Indexes (2), Search Heads (2). The Peers tab lists idx1, idx2, and idx3 with Fully Searchable = Yes and Status = Up. The expanded idx1 row shows Location, Last Heartbeat, Replication Port (9100), Base Generation ID (8), and GUID.]
Index Replication Health

State        Replication factor   Search factor
Complete     Met as specified     Met as specified
Valid                             One or greater
Searchable   Have two searchable copies of each bucket (searchable rolling restart)

• With replication factor = 3 and search factor = 2:
- A complete cluster has 3 copies of each bucket, 2 of which are searchable
- A valid cluster has at least one searchable copy of all buckets
• With replication factor = 2, search factor = 2, and 2 peers:
- A complete cluster has 2 searchable buckets, each having its copy of rawdata
- A valid cluster has at least one searchable copy of all buckets
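
The two definitions above can be written as a small check over per-bucket copy counts. This is an illustrative sketch only: the function name and the (total copies, searchable copies) input shape are assumptions, and the manager node's real bookkeeping is not shown in this course.

```python
from typing import Iterable, Tuple


def cluster_state(buckets: Iterable[Tuple[int, int]],
                  replication_factor: int, search_factor: int) -> dict:
    """Classify a cluster from (total_copies, searchable_copies) pairs, one pair per bucket."""
    buckets = list(buckets)
    # Valid: every bucket has at least one searchable copy.
    valid = all(searchable >= 1 for _, searchable in buckets)
    # Complete: every bucket meets both the replication factor and the search factor.
    complete = all(total >= replication_factor and searchable >= search_factor
                   for total, searchable in buckets)
    return {"valid": valid, "complete": complete}


if __name__ == "__main__":
    # One bucket lost a copy after a peer failure: still valid, no longer complete.
    print(cluster_state([(3, 2), (2, 1)], replication_factor=3, search_factor=2))
```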
Factors in Action - Data Replication

[Diagram: index cluster with a search head, manager node, forwarder, and 4 peer nodes holding copies of Bucket 1 and Bucket 2. Legend: Replication factor (RF), Search factor (SF) = 2, primary (origin), searchable backups, rawdata only.]

Factors in Action - Search

Complete & Valid

[Diagram: the same cluster during a search. Legend: primary (origin), searchable backups (TSIDX and rawdata), rawdata only.]

Factors in Action - Initial Primary Loss

Valid but Not Complete

[Diagram: the cluster after the peer holding a primary copy is lost.]

Factors in Action - Initial Primary Loss (cont.)

Complete & Valid

[Diagram: the cluster after the lost copies have been replaced on the remaining peers.]

Factors in Action - Second Peer Loss

Valid but Not Complete

In the scenario of losing two peers, searchable backups are promoted to primary copies.

Factors in Action - Second Peer Loss (cont.)

Valid but Not Complete

Factors in Action - Rebalancing

Complete & Valid After Rebalancing

[Diagram: the cluster after rebalancing. Legend additionally marks excess copies.]
Excess Buckets

[Screenshot: Indexer Clustering: Master Node dashboard, Indexes tab. Status indicators: All Data is Searchable, Search Factor is Met, Replication Factor is Met. The Bucket Status button lists excess buckets. Table columns: Index Name, Fully Searchable, Searchable Data Copies, Replicated Data Copies, Buckets, Cumulative Raw Data Size. Rows shown: _audit and _internal, both fully searchable with 2 searchable data copies and less than 0.01 GB of raw data.]







Notable Indexer Cluster Log Channels

• Cluster peers communicate via the /services/cluster endpoints
- Peer to manager communication: /services/cluster/master
- Manager to peer communication: /services/cluster/slave
• splunkd_access.log: indexer cluster communication logs
- Example:
index=_internal sourcetype=splunkd_access
(uri="/services/cluster/slave/buckets*" OR uri="/services/cluster/master/buckets*")
| convert ctime(_time) | table _time uri_path | sort _time
- Higher response time indicates service overloading
- Response status 200 is good, anything else is not good
• splunkd.log: indexer clustering activity logs
- component=CM* OR component=Cluster*
- Look for WARN/ERROR on the host=<master>
• metrics.log
- component=Metrics group=clusterout_connections
Manager Node Failover

• If the manager node is lost, the cluster continues to operate
- New data arriving at peer nodes is indexed, but might not replicate
- The search heads continue to send queries to the last known list of peer nodes, and peer nodes respond if they can
- After the manager node comes back online, buckets are re-balanced
• A stand-by manager node can be configured
- https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Handlemanagernodefailure
• You can employ DNS-based failover, a load balancer, or some other technique to switch the same master_uri to the stand-by manager
- Hot-standby is NOT recommended
Manager Node Failover (cont.)

• A standby manager only needs the primary manager's static state information
- SPLUNK_HOME/etc/system/local/server.conf
- SPLUNK_HOME/etc/master-apps
• When the standby manager node starts, its services are blocked until it can fulfill the replication factor
- To have the standby manager unblock immediately, run:
splunk set indexing-ready

Note: Extra provisioning is required if the Monitoring Console is also enabled on the Manager Node.
Restarting Indexer Cluster

• Ordinarily you do not restart the entire cluster
- Search heads can be restarted at any time
• If you do need to restart the entire cluster:
1. Restart the manager node with splunk restart
2. Run preliminary health checks
» Must be in a searchable state
» Check the manager dashboard for cluster status, or
» CLI: splunk show cluster-status --verbose
3. Restart peer nodes:
» Select Edit > Rolling Restart
» CLI: splunk rolling-restart cluster-peers

[Screenshot: Edit menu on the manager dashboard with entries including Master Node Configuration, Configuration Bundle Actions, Data Rebalance, Rolling Restart, and Disable Indexer Clustering]
Peer Nodes Rolling Restart

• Performs a phased restart of all the peer nodes
- Restarts 10% of the peers at a time in random order (configurable)

[Screenshot: Index Cluster Rolling Restart dialog. "Are you sure you want to initiate a rolling restart? This action puts the cluster into maintenance mode." Options: Searchable (restart peers with minimal search interruption) and Peer percent (specify percentage of peers to restart, default is 10). Buttons: Cancel, Begin Rolling Restart.]

- Searchable: the Indexer Cluster Searchable Rolling Restart option (more details in Module 4)
- Peer percent, CLI equivalent: splunk edit cluster-config -percent_peers_to_restart 100
- Begin Rolling Restart, CLI equivalent: splunk rolling-restart cluster-peers
Migrating Non-clustered Indexers to a Cluster

• You can add a non-clustered indexer to a cluster as a peer node at any time
splunk edit cluster-config -mode peer ...
• Apps must be re-distributed via master-apps
- You learn about distributing apps in indexer clusters in Module 4
• Only new data coming into this peer is replicated
- New data follows the cluster's replication factor
• Existing buckets are not replicated
- Contact Splunk Professional Services if you must replicate legacy buckets

Note: You cannot convert a peer node to a non-clustered indexer.
Upgrading and Applying Maintenance Releases

http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Upgradeacluster

• Maintenance update? If yes, you can perform an online rolling update:
1. Update manager node first, then search heads
2. Put the manager into maintenance mode
» Discussed in module 4
3. Upgrade the peer nodes
» Use the splunk stop command

Note:
- Manager node must run the highest version
- Search head must run higher version than the peer nodes
- Complete the entire process quickly

• If no: upgrade from 7.1+? If yes, you can perform a searchable rolling upgrade:
- No service interruption during the upgrade
- More discussions in module 3

• If no: perform a non-rolling upgrade:
- Must stop the cluster before you can upgrade
- Upgrade manager node > Upgrade search heads > Upgrade peer nodes

Note:
- Must upgrade the tiers in the prescribed order
- Within each tier, upgrade all nodes as a single operation
Lab Exercise 2 - Enable Single-site Cluster

• Time: 25 - 30 minutes
• Tasks:
- Switch all cluster members to license peers
- Configure the manager node for a single-site indexer cluster
- Configure three indexers to form the replication peers
- Configure a search head to join the cluster
- Monitor the cluster status with Splunk Web
• Optional Tasks: (additional 15 minutes)
- Test a peer node failover scenario
- Investigate the peer outage with Splunk internal logs
Lab Exercise 2 - Enable Single-site Cluster (cont.)

[Diagram: SSH access to the lab hosts. Commands shown: ssh you@Public_DNS and ssh you@10.0.x. (address truncated in the source). Hosts shown: Indexer Cluster (10.0.x.1) and Misc-Server (10.0.x.3). Instances and ports shown: cmanager (8089), dserver, 8289, 8389.]

x = Your student ID
8?89 = splunkd-port
Module 3: Multisite Indexer Cluster
Module Objectives

• Describe how Splunk multisite indexer clusters work
• Identify multisite terms
• Implement a multisite indexer cluster
• Describe optional configuration settings
Key Benefits of Multisite Indexer Cluster

• Allows for an extra layer of data partitioning
- Indexers are grouped by "sites"
- A site is a logical grouping of Splunk instances
• Multisite clusters offer two key benefits:
1. Disaster recovery
» Stores index copies at multiple sites (e.g., geo-location or rack)
» Provides automatic site-failover capability
» In case of a disaster, indexing and searching continue on the surviving sites
2. Search affinity
» Preferentially searches assigned site
» Greatly reduces WAN network traffic

[Diagram: cluster sites, including HQ (site1) and JP (site3)]
Multisite Cluster Key Attributes

multisite                 Enables multisite clustering
site                      - A logical group that shares clustering policies
                          - Also the site where the manager node resides
available_sites           - Defines the sites in the cluster
                          - Supports up to 63
site_replication_factor   Controls how to distribute raw copies of data among the sites
site_search_factor        Controls how to distribute searchable copies
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site Replication Factor Examples 


site_replication_factor value                  Meaning
origin:2, total:3                              Default. Put the extra copy on a site that doesn't have a copy
origin:1, total:4 (where there are 4 sites)    Try to put a copy on any site that doesn't have one
origin:2, site1:2, total:5                     Both site1 and origin have a minimum of 2 copies
origin:2, site1:1, total:4                     If origin happens to be site1, then the higher value takes precedence
origin:2, site1:2, site2:2, total:3            Invalid

Note
• You must specify the origin
• A factor count must be greater than 0
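The example values above can be checked with a short script. This is a minimal Python sketch, not Splunk's own validation: the function names are hypothetical, and the rule "total must cover origin plus every explicitly listed site" is a conservative simplification (the real rule also depends on how many sites are listed in available_sites).

```python
def parse_site_factor(spec):
    """Parse a string such as 'origin:2, site1:1, total:4' into a dict of ints."""
    factor = {}
    for part in spec.split(","):
        key, _, value = part.strip().partition(":")
        factor[key.strip()] = int(value)
    return factor


def check_site_factor(spec):
    """Return a list of problems; an empty list means the spec passes these simplified checks."""
    factor = parse_site_factor(spec)
    problems = []
    if "origin" not in factor:
        problems.append("origin must be specified")
    if "total" not in factor:
        problems.append("total must be specified")
    if any(count <= 0 for count in factor.values()):
        problems.append("every count must be greater than 0")
    # Simplified assumption: total has to cover origin plus all explicit sites.
    explicit = sum(count for name, count in factor.items() if name != "total")
    if "total" in factor and factor["total"] < explicit:
        problems.append("total is smaller than origin plus the explicit site counts")
    return problems


if __name__ == "__main__":
    for spec in ("origin:2, total:3", "origin:2, site1:2, site2:2, total:3"):
        print(spec, "->", check_site_factor(spec) or "ok")
```

With this sketch, the first four rows of the table pass and the last row is reported as a problem, matching the "Invalid" entry.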
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Multisite Indexer Cluster Deployment 


1. Determine multisite cluster use cases and requirements 


2. Install Splunk Enterprise and configure cluster instances 
- One manager node 
- At least two peer nodes per site 
- One or more search heads per site (recommended) 

3. Enable clustering on the instances in the order of 
Manager Node > Peer Nodes > Search Heads 
- Splunk CLI, or manually edit server.conf


4. Create and distribute the configuration bundle to the peer nodes 
5. Configure forwarders to send data to the peer nodes 
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Multisite Cluster Topology 


[Diagram: Manager Node, peer nodes, and a Forwarder, connected by three kinds of links]

Connection types:
- Management (splunkd port)
- Replication (replication port)
- Data (receiving port)
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Configuring the Multisite Manager Node 


splunk edit cluster-config -mode manager -multisite true -site site1
    -available_sites site1,site2 -site_replication_factor origin:1,total:2
    -site_search_factor origin:1,total:2 -secret mycluster


SPLUNK_HOME/etc/system/local/server.conf

[general]
site = site1

[clustering]
multisite = true
mode = manager
available_sites = site1,site2
site_replication_factor = origin:1,total:2
site_search_factor = origin:1,total:2
pass4SymmKey = <hashed secret>
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Configuring Multisite Cluster Peer Nodes 


site1 peers (Peer1&2):
splunk edit cluster-config -mode peer -site site1 -master_uri https://10.0.1.3:8089
    -replication_port 9100 -secret mycluster

site2 peers (Peer3&4):
splunk edit cluster-config -mode peer -site site2 -master_uri https://10.0.1.3:8089
    -replication_port 9100 -secret mycluster


Peer1&2 server.conf

[general]
site = site1

[clustering]
mode = slave
master_uri = https://10.0.1.3:8089
pass4SymmKey = <hashed secret>
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Peer3&4 


Peer3&4 server.conf 


[general] 
site = site2 


[clustering] 

mode = slave 
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Configuring Multisite Cluster Search Heads 


• Search heads can:
  - Join the indexer cluster at any time
  - Participate in multiple indexer clusters
  - Combine searches across clustered and non-clustered search peers

• Steps to configure a SH to search multiple indexer clusters:
  - If not already a cluster search head, enable it first:
    splunk edit cluster-config -mode searchhead ...
  - To add the search head to another indexer cluster, run:
    splunk add cluster-master <master_uri:port> ...
  - If you need to change the clustering configuration or attributes:
    splunk edit cluster-master <master_uri:port> ...
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Configuring a New Multisite Search Head 


Enable a new instance (SH2) as a cluster search head:

splunk edit cluster-config -mode searchhead -master_uri https://10.0.1.3:8089 -site site2
    -secret mycluster


SH2 server.conf

[general]
site = site2

[clustering]
master_uri = https://10.0.1.3:8089
mode = searchhead
multisite = true
pass4SymmKey = <hashed secret>
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Configuring an Existing SH to Multisite 


1. Convert the existing single-site cluster search head (SH1) to multisite mode:
   splunk edit cluster-master https://10.0.1.3:8089 -multisite true -site site1 -secret mycluster

2. Enable the converted SH1 to search an additional single-site cluster:
   splunk add cluster-master https://20.0.2.6:8089 -secret 2ndCluster


SH1 server.conf

[clustering]
master_uri = clustermaster:10.0.1.3:8089,clustermaster:20.0.2.6:8089
mode = searchhead

(1) [clustermaster:10.0.1.3:8089]
master_uri = https://10.0.1.3:8089
multisite = true
pass4SymmKey = <hashed secret>
site = site1

(2) [clustermaster:20.0.2.6:8089]
master_uri = https://20.0.2.6:8089
multisite = false
pass4SymmKey = <hashed secret 2>
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Manager Node View — Multisite Cluster 


[Screenshot: Indexer Clustering: Master Node dashboard]

Status: All Data is Searchable | Search Factor is Met | Replication Factor is Met
Peers: 4 searchable, 0 not searchable
Indexes: 2 searchable, 0 not searchable
Tabs: Peers (4), Indexes (2), Search Heads (3)

Peer Name   Site    Fully Searchable   Status   Buckets
idx1        site1   Yes                Up       18
idx2        site1   Yes                Up       16
idx3        site2   Yes                Up       25
idx4        site2   Yes                Up       10

Expanded details for idx4:
Location ............. 10.0.1.1:8489
Last Heartbeat ....... 3/6/2018, 1:57:52 PM
Replication Port ..... 9400
Base Generation ID ... 8
GUID ................. DAA8BB0A-8697-4121-A5E7-5E3DC43F160C
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search Affinity 


• In single-site mode, there is only one set of “primary” searchable buckets that respond to searches

• With multisite, each site can have searchable replicas that respond to searches

• Search affinity (enabled by default)
  - Search heads have a site association
  - Searches get as many events as they can from the same site
    » If a searchable bucket exists on the site, it will be the primary bucket for that site
    » Searches will extend across sites only when they are needed
  - Limit the access of each user to only their local search heads
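The site preference described above can be pictured with a toy model. The Python sketch below is only an illustration: the function name and the data layout are hypothetical, and in a real cluster the manager node assigns primaries rather than the search head picking copies.

```python
def pick_search_copies(bucket_copies, search_head_site):
    """Choose one searchable copy per bucket, preferring the search head's own site.

    bucket_copies maps bucket id -> {site: [peers holding a searchable copy]}.
    Returns bucket id -> (site, peer).
    """
    chosen = {}
    for bucket, copies_by_site in bucket_copies.items():
        local_peers = copies_by_site.get(search_head_site, [])
        if local_peers:
            # A searchable copy exists on the local site, so it answers the search.
            chosen[bucket] = (search_head_site, local_peers[0])
            continue
        # Otherwise extend the search across sites (this is the WAN traffic case).
        for site in sorted(copies_by_site):
            if copies_by_site[site]:
                chosen[bucket] = (site, copies_by_site[site][0])
                break
    return chosen


if __name__ == "__main__":
    buckets = {
        "b1": {"site1": ["idx1"], "site2": ["idx3"]},
        "b2": {"site2": ["idx4"]},
    }
    print(pick_search_copies(buckets, "site1"))
```

In this example a search head on site1 reads b1 locally and reaches across to site2 only for b2, which has no searchable copy on site1.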
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Multisite Factors in Action 


Site replication factor: origin:2, total:3
Site search factor: origin:1, total:2

Cluster state: Complete & Valid

[Diagram legend: Origin (O), Searchable Backup (B), Rawdata Only (R)]
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Multisite Factors in Action — Primary Loss 


Site replication factor: origin:2, total:3
Site search factor: origin:1, total:2

Cluster state: Valid But Not Complete
Origin:2 is not possible

[Diagram legend: Origin (O), Searchable Backup (B), Rawdata Only (R)]


Splunk Cluster Administration
Copyright © 2022 Splunk, Inc. All rights reserved | 25 February 2022


Multisite Factors in Action — Site Loss 


Site replication factor: origin:2, total:3
Site search factor: origin:1, total:2

Cluster state: Valid But Not Complete

[Diagram legend: Origin (O), Searchable Backup (B), Rawdata Only (R)]
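The three "Factors in Action" slides can be replayed for a single bucket with a small script. This is a rough Python sketch under stated assumptions: "valid" is taken to mean at least one searchable copy exists, "complete" means both site factors are met, and the function name and tuple layout are hypothetical.

```python
def bucket_state(copies, origin_site, rep_factor, search_factor):
    """Classify one bucket as (valid, complete).

    copies is a list of (site, is_searchable) tuples, one per surviving copy.
    rep_factor and search_factor are dicts such as {"origin": 2, "total": 3}.
    """
    all_sites = [site for site, _ in copies]
    searchable_sites = [site for site, is_searchable in copies if is_searchable]

    # Assumption: valid means a searchable copy is available to act as primary.
    valid = len(searchable_sites) >= 1
    # Assumption: complete means both the replication and search factors are met.
    complete = (
        valid
        and len(all_sites) >= rep_factor["total"]
        and all_sites.count(origin_site) >= rep_factor["origin"]
        and len(searchable_sites) >= search_factor["total"]
        and searchable_sites.count(origin_site) >= search_factor["origin"]
    )
    return valid, complete


if __name__ == "__main__":
    rep = {"origin": 2, "total": 3}
    search = {"origin": 1, "total": 2}
    healthy = [("site1", True), ("site1", False), ("site2", True)]
    site_loss = [("site2", True)]
    print(bucket_state(healthy, "site1", rep, search))
    print(bucket_state(site_loss, "site1", rep, search))
```

The healthy layout is both valid and complete; after the origin site is lost, the bucket stays valid through the site2 copy but can no longer be complete.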
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Example: site1 Down 





[Screenshot 1: Indexer Clustering: Master Node dashboard]

Status: Some Data is Not Searchable | Search Factor is Not Met | Replication Factor is Not Met
Peers: 2 searchable, 2 not searchable
Indexes: 1 searchable, 1 not searchable
Tabs: Peers (4), Indexes (2), Search Heads (3)

Peer Name   Site    Fully Searchable   Status    Buckets
idx1        site1   No                 Down      0
idx2        site1   No                 Pending   22
idx3        site2   Yes                Up        27
idx4        site2   Yes                Up        10

[Screenshot 2: Indexer Clustering: Master Node dashboard]

Status: All Data is Searchable | Search Factor is Not Met | Replication Factor is Not Met
Peers: 2 searchable, 2 not searchable
Indexes: 2 searchable, 0 not searchable
Tabs: Peers (4), Indexes (2), Search Heads (3)

Peer Name   Site    Fully Searchable   Status   Buckets
idx1        site1   No                 Down     0
idx2        site1   No                 Down     0
idx3        site2   Yes                Up       29
idx4        site2   Yes                Up       12

• Pending
  - A replication failed
  - Transitions to the next flag based on the subsequent heartbeat check

• Down
  - The peer went offline for some unknown reason
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Disabling Search Affinity 


• You can disable search affinity for overall search performance
  - Spreads the search request across indexers on all sites
  - Will increase WAN traffic
  - IMPORTANT: all sites must be in close proximity with very low network latency
• To disable search affinity, edit the search head configuration and set its site to site0:

splunk edit cluster-master https://10.0.55.3:8089 -site site0


SH2 server.conf (Search affinity disabled)

[clustermaster:10.0.55.3:8089]
master_uri = https://10.0.55.3:8089
multisite = true
pass4SymmKey = <hashed secret>
site = site0
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Manager Node Failover 


• Preparing for manager node failover is the same for both single-site and multisite clustering
• When a manager restarts, it blocks indexing until enough peers join each site to fulfill the replication factor
• A cluster can be in a state where it CANNOT fulfill the replication factor, for example:
  - You need to restart the manager while a site is down, with a policy such as
    origin:1, site1:1, site2:1, site3:1, total:4
  - The site that includes the manager goes down and a stand-by manager starts up on another site
• To unblock cluster services, run splunk set indexing-ready every time you restart the manager node
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Restarting a Multisite Indexer Cluster 


• Again, ordinarily you do not restart the entire cluster
  - Search heads can be restarted at any time
• If you must restart the entire cluster:
  1. Restart the manager node
  2. Run preliminary health checks
     » Check the manager dashboard for cluster status
  3. Begin rolling restart
     » The rolling restart in multisite proceeds with site awareness
     » Can set a specific site order

splunk rolling-restart cluster-peers -site-by-site true -site-order site2,site1





[Dialog: Index Cluster Rolling Restart]

Are you sure you want to initiate a rolling restart? This action puts the cluster into maintenance mode.
- Searchable: Restart peers with minimal search interruption.
- Peer percent: 10 (Specify percentage of peers to restart, default is 10. Always calculated globally.)
- Site Order
Buttons: Cancel, Begin Rolling Restart
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Migrating from Single-site to Multisite 


e Make sure all cluster nodes are running the same Splunk Enterprise version 


1. Change the manager node to multisite mode and restart 
- DO NOT remove the existing single-site replication factor and search factor 
- New multisite factors must be at least as large as the single-site factors 

2. Enable maintenance mode on the manager: 
splunk enable maintenance-mode 

3. Change peer nodes to multisite mode with a site association and restart 
- DO NOT restart a peer that hasn't been converted 

4. Change search heads to multisite mode and restart 


   splunk edit cluster-master https://CMaster:8089 -multisite true -site site1 -secret mycluster


5. Disable maintenance mode on the manager: 
splunk disable maintenance-mode 
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Indexer Cluster Maintenance Mode 


e Invoke maintenance-mode when performing any work on a peer node that may 
cause excessive bucket status changes 
> splunk [enable|disable|show] maintenance-mode 
- Bucket fixup discontinues copying buckets between peers to maintain complete state 
- Only reassigns primaries to maintain a valid state 
- Peer outage while in maintenance mode may return incomplete results 
- Maintenance mode persists across manager restarts 
- No notion of sites; works for both indexer cluster configurations 
- Bucket rolling that would occur due to peer outage is discontinued 
- After disabling maintenance mode, manager node catches up on replication policies 


• splunk apply cluster-bundle and splunk rolling-restart automatically invoke maintenance mode
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Migration Notes 


• Multisite policies apply to new data only
• Existing non-clustered buckets will not replicate; they just age out
• By default, existing single-site buckets follow the existing single-site policies until they age out
  - Do not remove the existing single-site replication attributes
  - Multisite total values must be at least as large as the single-site factors
    » If necessary, reduce the single-site factors to match the smallest number of peers on any site
• Set constrain_singlesite_buckets = false in the manager's server.conf to have single-site buckets replicate across sites
  - They then follow the specified site_replication_factor and site_search_factor
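The factor rule from the migration steps (new multisite totals at least as large as the single-site factors) can be checked before editing server.conf. The Python sketch below is a hypothetical pre-flight helper, not a Splunk tool; the function names and messages are illustrative only.

```python
def total_of(site_factor):
    """Extract the total count from a string such as 'origin:2,total:3'."""
    for part in site_factor.split(","):
        key, _, value = part.strip().partition(":")
        if key.strip() == "total":
            return int(value)
    raise ValueError("site factor has no total: " + site_factor)


def check_migration_factors(replication_factor, search_factor,
                            site_replication_factor, site_search_factor):
    """Return a list of problems with the planned multisite factors."""
    problems = []
    if total_of(site_replication_factor) < replication_factor:
        problems.append("site_replication_factor total is below replication_factor")
    if total_of(site_search_factor) < search_factor:
        problems.append("site_search_factor total is below search_factor")
    return problems


if __name__ == "__main__":
    # Single-site cluster with replication_factor=3 and search_factor=2.
    print(check_migration_factors(3, 2, "origin:2,total:3", "origin:1,total:2"))
    print(check_migration_factors(3, 2, "origin:1,total:2", "origin:1,total:2"))
```

The first plan passes; the second is flagged because a multisite total of 2 is smaller than the existing replication factor of 3.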
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Indexer Cluster Searchable Rolling Upgrade 


• Splunk 7.1+ supports searchable rolling upgrade
  - Minimal service interruption during the upgrade
  - All nodes must be running version 7.1 or later
• For versions prior to 7.1, see:
  https://docs.splunk.com/Documentation/Splunk/7.0.0/Indexer/Upgradeacluster
• Maintenance releases should be applied in the same order

Note: Health checks are not all inclusive.

Manager Node
  1. Run preliminary health checks:
     splunk show cluster-status --verbose
  2. Upgrade the manager node
  3. Initialize searchable rolling upgrade:
     splunk upgrade-init cluster-peers

All Peer Nodes
  4. Take the peer offline:
     splunk offline
  5. Upgrade the peer node
  6. Bring the peer online:
     splunk start
  7. Repeat on all peer nodes

Manager Node
  8. Finalize searchable rolling upgrade:
     splunk upgrade-finalize cluster-peers
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Further Reading: Clustering 


• Basic clustering concepts for advanced users
  - http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Basicconcepts
• Manager site failover
  - https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Handlemanagernodefailure
• Configure the search head
  - http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Configurethesearchhead
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Lab Exercise 3 — Migrate to Multisite Cluster 


e Time: 20 - 25 minutes 


e Tasks: 
- Migrate the single-site manager node to the multisite mode 
- Migrate the existing peer nodes and add a new peer 
- To configure site2, convert IDX3 and add a new peer IDX4.
- Convert SH1 to site1 search head and add SH2 to site2 


e Optional Tasks: (additional 15 minutes) 
— Test the indexer site failover scenario 
» Stop site1 processes 
» Check the cluster status and verify the search affinity 
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Lab Exercise 3 — Migrate to Multisite Cluster (cont. ) 


[Diagram: lab environment]
- Indexer Cluster (10.0.x.1), including cmanager (8089): SSH you@10.0.x.1
- dserver (8189)
- SSH you@10.0.x.2
- Misc-Server (10.0.x.3)
- Entry point: SSH you@Public_DNS

x = Your student ID
8?89 = splunkd-port




Lab Exercise 3 — Migrate to Multisite Cluster (cont. ) 


[Diagram: browser access to the Indexer Cluster]

Browser: http://{Public_DNS}/{splunk_server}
For example:
  http://{Public_DNS}/cmanager
  http://{Public_DNS}/sh1

Public_IP = Same as your Misc-Server
splunk_server = Splunk server name
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Module Objectives 


• Enable replication for custom indexes
• Deploy common apps and configurations to peer nodes
• Take a peer offline temporarily
• Decommission a peer permanently
• Clean up excess cluster buckets
• Optimize peer node storage utilization
• Configure Monitoring Console for indexer cluster environment
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Configuring Clustered Indexes: repFactor 


• The repFactor attribute in indexes.conf configures whether or not the index participates in the cluster
  - repFactor = auto (replicate)
  - repFactor = 0 (do not replicate)

• All peer nodes must use the same set of indexes.conf files
  - DO NOT use the Splunk UI to configure index settings in a cluster
  - Deploy from the manager node

• Default is 0 (do not replicate); however, the internal indexes are set to replicate automatically
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master-apps/_cluster/default/indexes.conf 


[main]
repFactor = auto

[history]
repFactor = auto

[summary]
repFactor = auto

[_internal]
repFactor = auto

[_audit]
repFactor = auto

[_thefishbucket]
repFactor = auto

[_telemetry]
homePath = $SPLUNK_DB/_telemetry/db
coldPath = $SPLUNK_DB/_telemetry/colddb
thawedPath = $SPLUNK_DB/_telemetry/thaweddb
repFactor = auto
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Master-apps Deployment Tool 


• The manager node distributes configuration bundles (apps) to the peer nodes
  - Supports common configurations for inputs, parsing, and indexing
    (inputs.conf, props.conf, transforms.conf, indexes.conf)

• Stage deployment bundles in the manager's SPLUNK_HOME/etc/master-apps directory
  - For common apps, copy them to the <app-name> subdirectory
  - For standalone files, copy them to the _cluster/local subdirectory

• If necessary, the manager node initiates an automatic rolling restart of all peers

[Screenshot: Manager Node: Settings > Indexer clustering]
Edit menu: Node Type, Configuration Bundle Actions, Data Rebalance, Rolling Restart, Disable Indexer Clustering
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Configuration Bundle Status and Actions 


[Screenshot: Configuration Bundle Actions page]

Click Push to distribute the configuration bundle to the set of peers. Optionally, validate the bundle and check if peer restart is r...

CLI equivalents:
- Commit (Push): splunk apply cluster-bundle [--skip-validation]
- Validate: splunk validate cluster-bundle [--check-restart]
- Status: splunk show cluster-bundle-status [--verbose]

Validate and Check Restart
Last Validate and Check Restart ... Successful
Restart ........................... Not Required
Updated Time ...................... 2/26/2018, 10:42:37 AM
Active Bundle ID .................. same value as Latest Bundle ID
Previous Bundle ID ................ N/A
Latest Check Restart Bundle ....... 4727D43B8D47182E7E041BC909C2B454

Per-peer row: Site site1, Status Up, with the same Active and Latest Bundle IDs and the Last Validated Bundle ID above

DO NOT edit the slave-apps content directly
- Read-only
- Will lose any direct changes
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Index-time Configuration Precedence Revisited 





Manager Node                       Peer Nodes
SPLUNK_HOME/etc/master-apps        SPLUNK_HOME/etc/slave-apps
  /_cluster                          /_cluster
    /default                           /default
    /local                             /local
  /<app_name>                        /<app_name>
    /default                           /default
    /local                             /local

Precedence order for indexer cluster peers:
1. slave-apps local directories (cluster peers only)
2. System local directory
3. App local directories
4. slave-apps default directories (cluster peers only)
5. App default directories
6. System default directory

Note: Within a level, app directories are evaluated in ASCII sort order.
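The precedence list above can be expressed as a lookup that walks the layers from highest to lowest priority. This is a simplified Python sketch: the layer labels and the function name are hypothetical, and it ignores per-stanza merging and the ordering among multiple apps.

```python
# Highest priority first, following the order on the slide.
PEER_PRECEDENCE = [
    "slave-apps local",
    "system local",
    "app local",
    "slave-apps default",
    "app default",
    "system default",
]


def effective_setting(layers, key):
    """Return (value, winning_layer) for key, or (None, None) if no layer sets it.

    layers maps a layer label from PEER_PRECEDENCE to a dict of settings.
    """
    for layer in PEER_PRECEDENCE:
        settings = layers.get(layer, {})
        if key in settings:
            return settings[key], layer
    return None, None


if __name__ == "__main__":
    layers = {
        "system local": {"repFactor": "0"},
        "slave-apps local": {"repFactor": "auto"},
        "app default": {"maxDataSize": "auto"},
    }
    print(effective_setting(layers, "repFactor"))
```

Here the value pushed from the manager node into slave-apps local wins over a conflicting system local edit made on the peer, which is why bundle contents override ad hoc changes on a peer.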
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Rollback the Configuration Bundle 


• If the Push action fails, you can restore to the previously running state
  - The rollback command restores the peer nodes to their previous state
  - After the rollback, fix the problem in the master-apps directory and re-apply
• To restore, click Rollback from the manager node
  - Peers can only join the cluster if bundle validation succeeds during restart
  - Rollback toggles only between the most recent configuration bundle and the previous bundle
    » This is different from undo
• You can also run: splunk rollback cluster-bundle
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Log Channels for master-apps Activities 


[Diagram: bundle validation flow between the Manager Node and the Peer Nodes]

Log components: CMBundleMgr (manager node), ClusterBundleValidator, BundleJob

validate cluster-bundle --check-restart
  Manager Node: Make a bundle > Bundle validate dry-run > Mark new & previous bundles
  Peer Nodes:   Download the bundle > Validate the bundle > Compute restart requirement

show cluster-bundle-status
  Manager Node: Display status
  Peer Nodes:   Report dry-run status

index=_internal sourcetype=splunkd component IN (CMBundleMgr, BundleJob, ClusterBundleValidator)


Log Channels for master-apps Activities (cont. ) 


[Diagram: bundle apply flow between the Manager Node and the Peer Nodes]

Log components: CMBundleMgr (manager node), ClusterBundleValidator, BundleJob

apply cluster-bundle
  Manager Node: Make a bundle > Bundle validate > Mark new & previous bundles
  Peer Nodes:   Download the bundle > Validate the bundle > Set the new active bundle > Reload apps to slave-apps

show cluster-bundle-status
  Manager Node: Display status; if needed, rolling-restart
  Peer Nodes:   Check restart requirement > Report

index=_internal sourcetype=splunkd component IN (CMBundleMgr, BundleJob, ClusterBundleValidator)
| stats count by _time, host, component, event_message
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Indexer Cluster Searchable Rolling Restart 


• Restarts peer nodes while maintaining search availability
  - Trades restart completion time for search availability
  - Takes considerably longer to complete because it restarts all nodes one at a time
  - Requires all cluster nodes to be on Splunk 7.1 or higher

[Screenshot: Search & Reporting app running "index=_internal metrics | stats count by host" (All time, real-time) during a default rolling restart, 3/9/2019, 1:27:02 AM]

Messages shown on the search head:
- The search process with sid=rt_1552094806.268 on peer=idx3 might have returned partial results due to a reading error while waiting for the peer. This can occur if the peer unexpectedly closes or resets the connection during a planned restart. Try running the search again.
- 3 errors occurred while the search was executing. Therefore, search res...
- Reading error while waiting for peer idx2. Search results might be incomplete. This can occur if the peer unexpectedly closes or resets the connection during a planned restart. Try running the search again. If the problem persists, confirm network connectivity between this instance and the peer, and review search.log and splunkd.log on the peer to check its activity.

Search head displays error messages when the default rolling-restart is in progress
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Indexer Cluster Searchable Rolling Restart (cont. ) 


• Best-effort search availability: performs a restart one peer node at a time
  - Completes in-progress searches before peer nodes restart
  - Searches that cannot complete within the timeout (default: 180 sec) are retried, timed out, or optionally deferred
    » In some cases, they may need to be explicitly retried by the user
  - Real-time and indexed real-time searches continue to run with available peer nodes

[Screenshot: Search & Reporting app running "index=_internal metrics | stats count by host" (All time, real-time) during a searchable rolling restart; 1169 of 1169 events matched]

Message shown on the search head (3/9/2019, 12:38:45 AM):
- One or more replicated indexes might not be fully searchable. Some search results might be incomplete or duplicated during bucket fix up. For more information, check the cluster manager page on the master - splunkd URI: https://10.0.1.3:8089.

Search is unaffected and continues when the searchable rolling-restart is in progress
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searchable Rolling Restart (SRR) Options 


splunk rolling-restart cluster-peers
    -searchable <true|false>
    -force <true|false>
    -restart_inactivity_timeout <sec>
    -decommission_force_timeout <sec>

• searchable <true|false>
  - false = ignore all options
• force <true|false>
  - true = proceed with the searchable rolling restart despite health check failures
    (Restart peers despite unmet search and replication factors)
  - Specify these additional parameters:
    » restart_inactivity_timeout <sec>
      Timeout before the manager node gives up on the current peer and moves on to the next peer
      (default: 600 sec = 10 min)
    » decommission_force_timeout <sec>
      Timeout for a peer to finish and voluntarily restart before it is forced to restart
      (default: 180 sec = 3 min)

Tip: When in doubt, use force=true to avoid SRR getting stuck forever

Note
The force=true option can ...
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Index Cluster Rolling Restart x 


Are you sure you want to initiate a rolling restart? This action puts the cluster into maintenance mode.
- Searchable: Restart peers with minimal search interruption.


Tracking Rolling Restart Progress

• Manager node dashboard (Indexer Clustering: Master Node)
  - Health status
  - Maintenance message
  - Progress bar
  - Additional messages
• CLI
  splunk show cluster-status --verbose

[Screenshot: Indexer Clustering: Master Node dashboard during a rolling restart]

Banner: This cluster is in maintenance mode. A rolling restart of cluster peers was initiated.
Peer Restart Progress: Restarted 1/4
Status: All Data is Searchable | Search Factor is Met | Replication Factor is Met
Peers: 3 searchable, 1 not searchable
Indexes: 5 searchable, 0 not searchable
Tabs: Peers (4), Indexes (5), Search Heads (4)

Rolling Status Messages:
[Mon Mar 11 17:03:45 2019] Force peer=F0483B8D-BBE0-4D93-957C-63FBB40816EA peer_name=idx2 to restart as it exceeds threshold

Peer Name   Site    Fully Searchable   Status       Buckets
idx3        site2   Yes                Up           52
idx4        site2   Yes                Up           24
idx1        site1   No                 Restarting   34
idx2        site1   Yes                Up           44
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Temporarily Taking a Peer Offline 


• Before taking a peer node offline, make sure the cluster has enough peer nodes to meet the replication policy
  - To minimize bucket fixup activities, take down only one node at a time

• To bring down a peer temporarily, run: splunk offline
  - The manager node delays bucket-fixing and just maintains a valid state
  - The peer node must be back online within 60 seconds (by default)
    » If the peer node does not return, the manager node initiates bucket fixup activities
  - If you need more time, extend the wait time on the manager node:
    splunk edit cluster-config -restart_timeout <wait_seconds>
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Permanently Decommissioning a Peer Node 


• To decommission a peer node permanently, run:
  splunk offline --enforce-counts
  - Does not shut down until all search and remedial activities have completed
  - Can take quite a while because the cluster must come to a complete state
    http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Takeapeeroffline
• The manager keeps the peer node information even when it is decommissioned
  - If you want to remove it from the manager node permanently, run:
    splunk remove cluster-peers -peers <guid>,<guid>,...

[Screenshot: decommissioned peer on the manager node dashboard]
Status ............... Graceful shutdown
Buckets .............. 0
Location ............. 10.0.1.1:8189
Last Heartbeat ....... 3/11/2019, 10:22:09 AM
Replication Port ..... 9100
Base Generation ID ... 18446744073709552000
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Adding or Moving Peer Nodes 


• To add a peer node to an existing cluster, use the standard peer node configuration procedures
• To move a peer node to a different site:
  1. Take the node offline with splunk offline --enforce-counts
  2. Move the server and have it join the new site's network
  3. Delete or uninstall Splunk Enterprise from the peer
     » Remove the index directories, if necessary
  4. Reinstall Splunk Enterprise
  5. Enable it as the peer node to the new site
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Indexer Cluster Peer Status 


The manager node dashboard reports several possible status conditions for peer nodes
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Howtomonitoracluster

Command                           Use               Status progression
splunk stop                       Not Recommended   ShuttingDown > Stopped
splunk offline                    Temporary         ReassigningPrimaries > ShuttingDown
splunk offline --enforce-counts   Permanent         Decommissioning > GracefulShutdown
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Decommissioning a site 


• After a site upgrade/migration, you can decommission a site that is no longer in use

• Prerequisites
  - The cluster must be in a complete state
  - The manager node must not be a part of the decommissioning site
  - There must be at least one searchable copy of each bucket on other remaining sites
• NOTE:
  - Any prior standalone and single-site buckets will be lost
  - Decommissioning starts bucket fixup activities and can take a considerable amount of time
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steps for Decommissioning a Site 


1. Reassign site search head(s) to a remaining site
2. If forwarders are associated with the decommissioned site using indexer discovery, reassign them to a remaining site
3. Run splunk enable maintenance-mode on the manager node
4. Update the following attributes in the manager's server.conf:
   - available_sites
   - site_replication_factor
   - site_search_factor
   - site_mappings
5. Restart the manager node
6. Run splunk disable maintenance-mode on the manager node to start the fixup process
7. Decommission each peer with splunk offline --enforce-counts

Note: site_mappings is a way to handle buckets that are stuck during fixups due to a decommissioned origin site.
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Site mappings Example 1 
Replace old peers in site2 with a new site

[Figure: manager server.conf before and after the change (available_sites, site_replication_factor, site_search_factor)]

site_mappings = site2:site3
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Site mappings Example 2 


Replace all peers in site1 & site2

[Figure: manager server.conf before and after the change (available_sites, site_replication_factor, site_search_factor)]

site_mappings = site1:site3,site2:site4
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Site mappings Example 3 


Migrate from multisites to a single site4

[Figure: manager server.conf before and after the change (available_sites, site_replication_factor, site_search_factor)]

Before: site_mappings = site2:site3
After:  site_mappings = default_mapping:site4
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Note Ka 


If a site used in a mapping is later 
decommissioned, its previous mappings 
must be remapped to an available site. 
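The mapping examples and the note above can be traced with a short lookup. This Python sketch is a hypothetical model of how a bucket's decommissioned origin site might be resolved; the function names are illustrative, and the actual fixup logic lives in the manager node.

```python
def parse_site_mappings(value):
    """Parse 'site1:site3,site2:site4' or 'default_mapping:site4' into a dict."""
    mappings = {}
    for pair in value.split(","):
        old_site, _, new_site = pair.strip().partition(":")
        mappings[old_site.strip()] = new_site.strip()
    return mappings


def effective_origin(origin_site, available_sites, mappings):
    """Return the site that should stand in for a bucket's origin site.

    Returns None when the origin is gone and no mapping points at an
    available site, which is the case the note warns about.
    """
    if origin_site in available_sites:
        return origin_site
    target = mappings.get(origin_site, mappings.get("default_mapping"))
    if target in available_sites:
        return target
    return None


if __name__ == "__main__":
    mappings = parse_site_mappings("site1:site3,site2:site4")
    print(effective_origin("site2", ["site3", "site4"], mappings))
```

A None result corresponds to the situation in the note: the mapped site was itself decommissioned, so the mapping has to be pointed at a site that is still available.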




Bucket Status and Fixup 


• Bucket fixup is the process of the manager node trying to meet the search and replication policies
  - Copies both rawdata and index data to meet the replication factor and search factor

• The Bucket Status page displays fixup tasks and their states

Note: A massive fixup after the loss of peers can saturate the network and adversely affect indexing tasks. Consider throttling non-hot replication bandwidth with the max_nonhot_rep_kBps setting in server.conf.

[Screenshot: Bucket Status page]
Tabs: Fixup Tasks - In Progress (0), Fixup Tasks - Pending (4), Indexes With Excess Buckets (2)
Here is a list of buckets waiting to be fixed.
Select Fix-up Category: Search Factor (0), Replication Factor (4), Generation (0)
Time in Fixup More Than: Unconstrained
Action menu: View Bucket Details, Roll, Resync, Delete Copy

Example row:
Bucket Name ...... _audit~3~A0823F51-B6A6-49D2-...
Index ............ _audit
Fixup Reason ..... heartbeat timeout or decommission complete
Time in Fixup .... 0 minute(s)
Current Status ... Missing enough suitable candidates to create replicated copy in order to meet replication policy. Missing={ site2:1 }
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Cleaning Up Excess Bucket Replicas 


e There might be extra copies of buckets in a cluster when a peer node rejoins a 
cluster after fixup has occurred 
- This does not affect searching, but consumes storage space 


e To determine and manage these excess buckets, use Splunk Web or CLI on
the manager node











splunk list excess-buckets [index]
splunk remove excess-buckets [index]


Note
Legacy buckets (from migration) are considered to be excess buckets but will not be removed.


[Screenshot: Bucket Status page, Indexes With Excess Buckets (2) tab, listing indexes with buckets exceeding the replication or search factor. Columns: Index Name, Buckets with Excess Copies, Buckets with Excess Searchable Copies, Total Excess Copies, Total Excess Searchable Copies, Action. Example row: _audit, 14, 14, 14, 17, Remove. A Remove All Excess Buckets button is also shown.]
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Monitoring Indexer Cluster with Metrics 


e Internal logs under clustering may get rotated too fast and a diag may not
provide sufficient information
- For more insight, search metrics.log
- metrics.log can provide service loads and job activities


index=_internal sourcetype=splunkd metrics name=cm*
| stats values(group) AS groups by name


name                            groups
cmmaster                        jobs
cmmaster_endpoints              subtask_counts, subtask_seconds
[one row illegible in source]
cmmaster_service                subtask_counts, subtask_seconds
cmslave                         jobs
cmslave_endpoints               subtask_counts, subtask_seconds
cmslave_executor                executor
cmslave_synchronous             jobs
cmslave_synchronous_executor    executor


Callouts on the results:
- Cluster service loads based on the number of endpoint accesses and their response times
- Scheduled tasks, finished tasks, tasks still pending
- Peers have their own corresponding metrics
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Primary Rebalancing 


e Primary rebalancing balances the search load across all peer nodes 


- The manager identifies duplicate searchable buckets and attempts to reassign 
approximately the same number of primaries on each peer 


- Does not actually move searchable copies to different peer nodes 


e Occurs independently for each site in a multisite cluster 
- Does not shift primaries between sites 


e Primary rebalancing triggers automatically when:
- A peer node joins or rejoins the cluster
- Manager node rejoins the cluster
- Rolling restart completes
e You can also trigger the rebalancing process manually with the REST endpoint
on the manager:
services/cluster/master/control/control/rebalance_primaries
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Data Rebalancing 


e Uneven bucket distribution doesn't utilize storage optimally and can cause higher
load on certain peer nodes


e Uneven bucket distribution can occur:
- After adding new peer nodes
- When forwarding data is skewed
- After frequent bucket fixups due to outages


[Screenshot: search job inspector (columns: Duration (seconds), Component, Invocations, Input count, Output count) showing dispatch.stream.remote output counts that differ widely between peers: 12,316,674 for idx1 and 8,431,689 for idx2, but fewer than 10,000 for each of the other two peers.]


e Data rebalancing redistributes the number of bucket copies per index
- Balances the storage distribution across the peer nodes
- Operates on only warm and cold buckets, NOT hot buckets
- Rebalances all non-searchable, searchable and primary buckets


e In a multisite cluster, data is rebalanced within a site as well as across sites


[Screenshot: Peers tab (columns: Peer Name, Site, Fully Searchable, Status, Buckets) showing uneven bucket counts: idx1 (site1) 68, idx2 (site1) 14, idx3 (site2) 40; the idx4 row is illegible. All legible peers are fully searchable and Up.]
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Data Rebalancing (cont.)


[Screenshot: manager node Edit menu (Node Type, Configuration Bundle Actions, Data Rebalance, Rolling Restart) and the Data Rebalance dialog with Threshold (default), Max Runtime (optional), Index, and Searchable fields. Status text: "Data has never been rebalanced since cluster master restart".]


> splunk edit cluster-config -rebalance_threshold 0.90

> splunk rebalance cluster-data -action start
[-searchable true] [-index <idx>] [-max_runtime <min>]

> splunk rebalance cluster-data -action status

> splunk rebalance cluster-data -action stop


e This process impacts search performance
- Can run it in searchable mode
» Takes longer to complete and requires more storage space
- The goal is to achieve a practical balance, not a perfect balance


- http://docs.splunk.com/Documentation/Splunk/latest/Indexer/rebalancethecluster
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Report and DM Acceleration Replication


e By default, indexer clusters do not replicate report acceleration and data 
model acceleration summaries 


- Only primary buckets have associated summaries 

e To enable summary replication, run this on the manager node: 
splunk edit cluster-config -summary_replication true
Or, set summary_replication = true in the [clustering] stanza of server.conf


e All searchable copies contain all the replicated
summaries for that bucket
- For hot buckets, the cluster creates a summary for
each searchable copy
- For warm/cold buckets, the cluster replicates to fill in
any missing or out-of-date summaries


Note
With summary replication enabled, summary-generating searches use
more resources across the cluster.
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Tsidx Reduction in Clustering 


e You can trade some search performance for significant index size reduction 
with the Splunk storage optimization feature 


- Only non-hot buckets old enough qualify
- Certain metadata files get removed from the optimized buckets 
- Produces minified buckets ending with a .mini.tsidx extension 


e Minified buckets are searchable buckets 


- If the cluster needs a new searchable bucket, it first attempts to replicate from
the existing searchable copy


- If the cluster has no searchable copy, it rebuilds a full bucket and then minifies
it during the next minification schedule


- NOTE: 
» Commands such as tstats and typeahead do not work on minified buckets 
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Peer Detention — Automatic 


e A peer enters the automatic detention state when the minFreeSpace
threshold in server.conf is crossed


e Default value is 5GB
e When a peer is in automatic detention:
- All indexing is stopped (internal and external)
- Replication is stopped
- The peer stops participating in searches
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Peer Detention — Manual 


e A peer can be put into a detention state manually. When a peer is in manual 
detention: 


- stops replicating data from other peer nodes 


- optionally disables external data ports, causing it to stop indexing most types 
of external data 


- continues to index internal data 
- continues to participate in searches


e Use cases for manual detention:
- Before decommissioning a peer, make it available only for searches
- Preempt automatic detention
- Diagnose a suspect peer by blocking indexing and replication 
- Force new data to go to other peers 
- Slow the disk writes by only allowing indexing, but not replication 
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Enabling Manual Detention 


e From the manager node: 


splunk edit cluster-config -manual_detention [on|on_ports_enabled|off]
-peers <guid1>,<guid2>,...


e From the peer:
splunk edit cluster-config -manual_detention [on|on_ports_enabled|off]


e on closes the TCP, UDP, and HEC ports
- Disables indexing and replication for all network-based or remote (forwarder) data inputs
- Still indexes local monitor and scripted inputs


e on_ports_enabled blocks incoming replication, but continues to index
e off disables the manual detention
e Checking the detention status


- On the manager node: splunk show cluster-status 
Settings > Indexer Clustering > Peers 
— On the peer node: splunk list cluster-config 
- On monitoring console: Indexing > Indexer Clustering > Indexer Clustering: Status 
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Key Indexer Cluster Maintenance Commands 


e Helpful CLI commands to run on the Manager Node 
splunk help clustering 


splunk rolling-restart cluster-peers 

splunk [enable|disable|show] maintenance-mode 
splunk set indexing-ready 

splunk validate cluster-bundle

splunk show cluster-bundle-status 

splunk apply cluster-bundle 
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Indexer Clustering Health Report 


e Reports every 20 seconds by default 


e feature:cluster_bundles reflects whether there are validation errors in the last bundle
that was pushed to cluster peers (yellow only)


e feature:data_durability reflects whether or not:
- The configured replication factor is met (red only)
- The configured search factor is met (red only)
e feature:data_searchable turns red if one or more buckets lack a primary
e feature:indexers tracks whether any peer nodes are in detention mode
- Yellow if in manual detention
- Red if in automatic detention
e feature:missing_peers tracks any peer nodes that are in transition
- Yellow if nodes are stopping, stopped, decommissioning, pending or restarting
- Red if nodes are down
e feature:indexing_ready stays green when the cluster is functional
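
The color rules for the indexers and missing peers features can be restated as a small Python sketch. This is only a restatement of the slide for illustration: the function names and the state strings are hypothetical, and letting red take precedence over yellow when both conditions apply is an assumption.

```python
# Peer states that the slide lists as "in transition".
TRANSITIONAL = {"stopping", "stopped", "decommissioning", "pending", "restarting"}


def indexers_color(detention_states):
    """detention_states: iterable of 'none', 'manual', or 'automatic' (hypothetical labels)."""
    states = set(detention_states)
    if "automatic" in states:
        return "red"      # any peer in automatic detention
    if "manual" in states:
        return "yellow"   # any peer in manual detention
    return "green"


def missing_peers_color(peer_statuses):
    """peer_statuses: iterable of status strings such as 'up', 'down', 'restarting'."""
    statuses = {status.lower() for status in peer_statuses}
    if "down" in statuses:
        return "red"      # assumed to win over yellow
    if statuses & TRANSITIONAL:
        return "yellow"
    return "green"


print(indexers_color(["none", "manual", "none"]))    # yellow
print(missing_peers_color(["up", "up", "down"]))     # red
```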
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Monitoring Clusters with Monitoring Console 


e Enable Monitoring Console (MC) on the
system that has the best vantage point of the
distributed deployment


- MC is the search head of all search heads 
e Enable only one MC instance in the entire 
deployment 

- A dedicated search head that only 
administrators can access within the indexer 
cluster 

- Deployer or dedicated license manager 

- Manager node 


- DO NOT enable on a search head cluster 
member or on a peer node 
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Monitoring Console Setup Prerequisites 


e Users require the admin_all_objects
capability to configure the MC
e Each instance must use a unique
servername and default-hostname
e Platform instrumentation is enabled for
every instance (UF optional)
e Optionally on the cmanager:
splunk edit cluster-config
-cluster_label idxc1


e Forward all indexes (including
internals and summaries) from search
heads and the cmanager node to the indexing
tier


outputs.conf
[first stanza and its settings illegible in source]
forwardedindex.filter.disable = true
[one setting illegible in source]
[tcpout:default-autolb-group]
server=idx1:9997,idx2:9997,idx3:9997,idx4:9997
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Post MC Configuration Checklist


e Add all instances, except peer nodes, as search peers to the MC 


e After enabling the MC to run in distributed mode, verify that all instances 
are discovered and the server roles are correct 
- Make sure only peer nodes (indexers) are marked as indexers 
- A search head that is also a license manager should have both roles marked 
- If the correct roles are not selected, click Edit and update


e Use custom groups to organize related components 
- The custom groups are used for view selection in the MC dashboards 


e For any changes in the environment, return to Setup,
check the server roles, and update if necessary


Note
To [illegible in source] the changes, always [illegible in source] on the Setup page.
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Indexer Cluster Dashboards in MC 


[Screenshot: Monitoring Console menu bar (Overview, Health Check, Instances, Indexing, Search, Resource Usage, Forwarders, Settings) with the Indexing menu open, showing entries including Indexer Clustering, Indexes and Volumes, Inputs, License Usage, and SmartStore.]


e Indexer Clustering: Status 
- Provides the same information as the manager's indexer clustering page 


e Indexer Clustering: Service Activity 
- For an ideal healthy cluster, most of the panels should be blank 
— The trending down of fixup tasks and service jobs count is normal 


- Pay attention to an increasing trend on pending tasks and jobs 
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Lab Exercise 4 — Monitor CM Service Activities 


e Time: 25 - 30 minutes 


e Tasks: 
- Stage an app and deploy it to the peer nodes 
- Disable indexing on cmanager and dserver 
- Enable the Monitoring Console to run in distributed mode on dserver 
- Monitor the indexer clustering service activities from Monitoring Console 
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Module 5:
Forwarder Configuration 
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Module Objectives 


e Use indexer discovery to configure forwarders in a clustered 
environment 


e Describe optional indexer discovery settings 
- Polling rate 
- Weighted load balancing 


e Optimize indexing loads with volume-based load balancing 
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Best Practices for Configuring Forwarders 


e Forwarders are not required to be configured for clustering 
e Enable load balancing and indexer acknowledgement in outputs.conf


outputs.conf
[tcpout:indexers]
server = <peer node list>
useACK = true


- The receiving peer (origin) tracks the write state of indexing data when 
useACK is enabled 


- Acknowledges when indexing data is written to a bucket and sent to 
replicating peers 


- Any node or network failure causes a bucket to roll and triggers a fixup 
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Forwarder Management Challenges 


e In a non-clustered environment, Splunk administrators must track target
server changes and update the forwarder outputs.conf settings
manually

- A static list of indexers has to be deployed to each forwarder
- Target indexer changes require restarting the forwarders


[Diagram: a forwarder whose outputs.conf has a [tcpout:indexers] stanza with a static server list and useACK = true sends to IDX1, IDX2, and IDX3 on port 9997; a newly added indexer is not in the list, so its port is shown as unknown.]
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Using Indexer Discovery 


e Designed for dynamic environments where capacity is added and removed
on demand
- Minimizes the forwarder restarts
- Scales well and is easier to manage
- Reduces the load on DNS servers
- Can use weighted load balancing
- Can be site-aware


1. Peers report their receiving ports to the manager node
2. Forwarders poll the manager node to get the latest list of
peer nodes
3. Forwarders send data to the peers in the list
4. A peer can be added or removed without affecting
the forwarder configurations


[Diagram: Forwarder with Indexer Discovery. The manager node, Peer1, and Peer2 are in Site 1; Peer3, Peer4, and Peer5 are in Site 2. Each peer receives on port 9997.]
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Configuring Indexer Discovery — Manager Node 


e No changes to the peer nodes to receive data from forwarders 


splunk enable listen 


e No CLI support for enabling indexer discovery 
e Enable indexer discovery on the manager node 


Manager node server.conf
[indexer_discovery]
pass4SymmKey = AnotherSecret


- The pass4SymmKey specifies the security key used to communicate between
the manager node and the forwarders
» Must use the same value on all forwarders and the manager node

- The pass4SymmKey here is NOT the same as the cluster secret


» For better security, use a different value than the one used between the 
manager node and the peer nodes 
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Configuring Indexer Discovery — Forwarders 


Each forwarder's outputs.conf
[tcpout:forwarder]
indexerDiscovery = cluster1
useACK=true
[indexer_discovery:cluster1]
master_uri = [value illegible in source]
pass4SymmKey = AnotherSecret


e Indexer discovery by default is not
site-aware (site = site0)
- Forward to peers across all sites


e If you want a forwarder to be site-
aware, assign a site-id


Each forwarder's server.conf
[general]
site = <site-id>


- Forwarders poll the manager at
set intervals to receive the most
recent list of peers


If both manual and indexer discovery attributes are set,
indexer discovery takes precedence.
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Indexer Discovery Option — Polling Rate 


e The manager determines a polling interval dynamically based on the
number of connected forwarders and the polling rate
- poll_interval (seconds) = #_of_forwarders / polling_rate + 30


- polling_rate is a fixed factor that the manager uses to calculate the
polling interval


» Set a factor between 1 and 10 (the default is 10)


# of forwarders | polling_rate | polling_int_sec | polling_interval
100 | 1 | 100/1+30=130 | 2 min. 10 sec.
100 | 10 | 100/10+30=40 | 40 sec.
1,000 | 10 | 1000/10+30=130 | 2 min. 10 sec.


Manager Node server.conf
[indexer_discovery]
pass4SymmKey = [value illegible in source]
polling_rate = <1-10>
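
The polling interval formula is easy to check by hand. The short Python sketch below reproduces the three rows of the table; the function name is hypothetical and the range check simply mirrors the 1 to 10 limit stated on the slide.

```python
def poll_interval_seconds(num_forwarders, polling_rate=10):
    """poll_interval = number of forwarders / polling_rate + 30 (seconds)."""
    if not 1 <= polling_rate <= 10:
        raise ValueError("polling_rate must be between 1 and 10")
    return num_forwarders / polling_rate + 30


# The same three cases as the table on the slide.
for forwarders, rate in [(100, 1), (100, 10), (1000, 10)]:
    seconds = poll_interval_seconds(forwarders, rate)
    minutes, remainder = divmod(int(seconds), 60)
    print(f"{forwarders} forwarders, rate {rate}: {minutes} min. {remainder} sec.")
```

A lower polling_rate stretches the interval, so forwarders learn about peer changes later but put less load on the manager.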
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Balancing Indexing Loads 


e Evenly distributed index data greatly improves the search 
performance 
e The default load balancing on the forwarder is based on time 
- Time-based forwarding alone cannot distribute data evenly across peers 


e Other options: 
- Weighted load balancing 
- Volume-based data forwarding


- Event breaker 
» Discussed in Splunk Enterprise Data Administration course 
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Using Weighted Load Balancing


e With indexer discovery, the manager can adjust the server list based 
on the peers’ relative storage capacities 
- More frequently selects peers with higher advertised capacity
selection_ratio = peer_capacity / cluster_wide_capacity


e Peers with more storage capacity, typically new peers, can become
preferred search peers because a larger percentage of recent data
will be indexed on these peers


Enable it on the manager's server.conf
[indexer_discovery]
indexerWeightByDiskCapacity = true


Specify the advertised capacity on peer nodes (optional)
[stanza and setting illegible in source]
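
To make the selection ratio concrete, here is a minimal Python sketch that applies peer_capacity / cluster_wide_capacity to a set of made-up capacities. The peer names and sizes are hypothetical, and the sketch only computes the ratios; it does not model how the manager actually orders the server list.

```python
def selection_ratios(peer_capacities):
    """Map each peer to peer_capacity / cluster_wide_capacity."""
    total = sum(peer_capacities.values())
    if total <= 0:
        raise ValueError("cluster-wide capacity must be positive")
    return {peer: capacity / total for peer, capacity in peer_capacities.items()}


# Hypothetical cluster: two older 500 GB peers and one new 1000 GB peer.
capacities_gb = {"idx1": 500, "idx2": 500, "idx3": 1000}
for peer, ratio in selection_ratios(capacities_gb).items():
    print(f"{peer}: {ratio:.2f}")
```

In this made-up cluster the new peer is selected about half the time, which is why recent data tends to concentrate on it.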
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Using Volume-based Data Forwarding


e Data rebalancing is NOT bucket age aware 


- Recent data can still concentrate on particular peers, while the overall buckets 
are evenly distributed 
- Therefore, it is preferred to balance the data coming from the forwarders 


e With volume-based forwarding, a forwarder can distribute more evenly 
- Sends a predefined amount of data before switching to another peer 


e To enable: 
- Set autoLBVolume=<size_in_bytes> in outputs.conf
» Set the autoLBVolume size in multiples of 64KB (65536)
- Set EVENT_BREAKER_ENABLE=true and its associated attributes in
props.conf
e autoLBVolume and autoLBFrequency settings work in conjunction
- Whichever threshold is met first triggers the switch to the next peer node
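
The two rules above (size in multiples of 64KB, and whichever threshold is met first) can be sketched in Python as follows. The helper names are hypothetical and the code illustrates the arithmetic only, not the forwarder's internal logic; the 30-second value used as the default is the documented autoLBFrequency default.

```python
BLOCK = 65536  # 64KB


def round_up_to_block(size_in_bytes):
    """Round a desired volume up to the next multiple of 64KB (at least one block)."""
    blocks = -(-size_in_bytes // BLOCK)  # ceiling division
    return max(blocks, 1) * BLOCK


def should_switch(bytes_sent, seconds_elapsed, auto_lb_volume, auto_lb_frequency=30):
    """True once either the volume or the time threshold has been reached."""
    return bytes_sent >= auto_lb_volume or seconds_elapsed >= auto_lb_frequency


volume = round_up_to_block(1_000_000)
print(volume)                            # 1048576
print(should_switch(volume, 5, volume))  # True: volume threshold reached first
print(should_switch(10, 31, volume))     # True: time threshold reached first
```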
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Forwarder Site Failover 


e You can configure site-awareness for forwarders in a multisite cluster 
- What happens to forwarding in case of a site failure? 


e Solution: enable the forwarder site failover 


- Forwarders must use indexer discovery 
- Send data to a secondary site when all nodes on the primary site are down 


- When a node from the primary site is back online, forwarding resumes to the primary site 


e To configure, run from the manager node: 
splunk edit cluster-config -forwarder_site_failover <primary>:<failover>


Manager node's server.conf
[clustering]
mode = master
multisite = true
available_sites = site1,site2,site3
forwarder_site_failover = site1:site3,site2:site3
[indexer_discovery]
[setting illegible in source]


Each forwarder's server.conf
[general]
site = site1
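
The failover behavior described on this slide can be summarized in a few lines of Python. This is a simplified model under stated assumptions: the function names are hypothetical, and peer availability is reduced to a count of reachable peers per site.

```python
def parse_failover(value):
    """Parse 'site1:site3,site2:site3' into {'site1': 'site3', 'site2': 'site3'}."""
    pairs = (item.split(":", 1) for item in value.split(",") if item.strip())
    return {primary.strip(): failover.strip() for primary, failover in pairs}


def target_site(forwarder_site, failover_map, peers_up_by_site):
    """Pick the site a forwarder sends to, given how many peers are up per site."""
    if peers_up_by_site.get(forwarder_site, 0) > 0:
        return forwarder_site  # at least one primary-site peer is up
    failover = failover_map.get(forwarder_site)
    if failover and peers_up_by_site.get(failover, 0) > 0:
        return failover        # every primary-site peer is down
    return None                # nothing reachable


failover_map = parse_failover("site1:site3,site2:site3")
print(target_site("site1", failover_map, {"site1": 0, "site3": 2}))  # site3
print(target_site("site1", failover_map, {"site1": 1, "site3": 2}))  # site1
```

The second call shows the recovery case: as soon as one primary-site peer is back, the forwarder's target is the primary site again.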
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Indexer Discovery Log Channels


e Search the manager node's indexer discovery log channel for any errors:
index=_internal component=CMIndexerDiscovery


e Search the forwarder's splunkd.log for any indexer discovery events:
index=_internal component IN(IndexerDiscoveryHeartbeatThread,
HttpPubSubConnection, TcpOutputProc)


e Search the forwarder's metrics.log for data distribution and failover:


index=_internal host=uf component=Metrics
group=tcpout_connections
| timechart span=1h sum(kb) by destIp
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Example: Indexer Discovery Site Failover 


index=_internal host=uf Metrics group=tcpout_connections | timechart span=1h sum(kb) by destPort


[Chart: kilobytes sent per destination port from 12:00 AM to 12:08 AM, with y-axis ticks at 1,000 and 10,000, annotated with the three events below.]


1. A volume-based forwarding UF was sending to site1
(9197 & 9297) until this point

2. Detected targets in site1 are no longer available and
switched to the failover site (9397 & 9497)

3. The original site has recovered and the forwarding
targets gradually transitioned back to the site1 peers


Note
In this example (your lab environment), the search uses
destPort, instead of destIp.
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Lab Exercise 5 — Configure a Forwarder 


e Time: 30 - 35 minutes 


e Tasks: 


- On the manager node, enable the indexer discovery option with 
forwarder site failover 


- Configure the deployment server 

- Enable the deployment client setting on the forwarder 
- Update the instance server role in Monitoring Console 
- Verify the forwarder app deployment 

- Test the forwarder site failover scenario 
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Lab Exercise 5 — Configure a Forwarder (cont.) 


[Diagram: lab environment]
- Indexer Cluster (10.0.x.1): cmanager, splunkd port 8089
- dserver (Deployer), splunkd port 8189
- Misc-Server (10.0.x.3)
- SSH access: you@Public_DNS, you@10.0.x.1, you@10.0.x.2
- Splunk Web for dserver: http://{Public_DNS}/dserver


X = Your student ID
8?89 = splunkd-port
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Lab Exercise 5 — Configure a Forwarder (cont.) 


[Diagram: Your Browser connecting to the lab Splunk servers, including the Indexer Cluster]


http://{Public_DNS}/{splunk_server}
For example:

http://{Public_DNS}/sh1
http://{Public_DNS}/sh2


Public IP = Same as your Misc-Server
splunk_server = Splunk server name
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Module 6: 
Search Head Cluster
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Module Objectives 


e Describe search head cluster architecture 

e Configure a search head cluster 

e Identify the captain and monitor cluster status 
e Describe optional configuration settings 
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Search Head Challenges


e Individual search heads are single points of failure 

e Maintaining consistent configurations across multiple search heads 
e Increasing concurrent users and searches 

e Managing large numbers of scheduled searches and alerts 
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Addressing Search Scaling


e As the number of simultaneous searches increases, you can take several
actions to mitigate performance issues
e Best practices:
- Increase the number of search peers (indexers)
- Optimize scheduled searches to run on non-overlapping time slots
- Isolate scheduled searches, real-time searches, and ad-hoc searches
- Limit the time range of end-user searches 
- Add more search heads 
- Configure user roles to limit the number of concurrent real-time searches 
- Implement search head clustering 
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Benefits of Search Head Clusters 


e Horizontal scaling 
- Members are interchangeable 
» Add more SHC members any time 
- Commodity hardware 
- Seamless user experience 
» Easy to on-board users and apps 


e Search high availability 
- Search job failure aware and reschedule 
- Reliable alerts 
e No single point of failure 
- Dynamic captaincy 
e Dedicated configuration bundle management 
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Search Head Cluster Terminology


e Raft distributed consensus
- Captain/Member (instead of manager/peers)


e Replication factor
- Applies only to search artifacts


e Deployer
- NOT the same as the deployment server


e How does it work?
- Members of search head cluster elect
the captain dynamically
- Replicates configuration changes to
all cluster members
- Captain schedules and manages searches


[Diagram: a Load Balancer in front of a Search Head Cluster whose members use distributed consensus.]
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How Does SH Cluster Scale Search Capacity? 


e Load-based scheduling heuristic 

- Captain is the only job scheduler 
» Captain establishes authority and is also a member 
» The normal scheduler on all members is suppressed 
» Captain schedules and delegates jobs to its members 

- Captain maintains global knowledge of all search jobs
» Members regularly report their job loads to the captain
» Ad-hoc and real-time search results (artifacts) are not replicated
» Saved and scheduled search artifacts are replicated per search head
cluster replication factor


- Captain directs which member to contact to access search results 
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Report Scheduler 


e Without SHC, a search head can defer and skip search jobs if: 
- Search head restarts (could cause a gap in summary index)
- Search head encounters resource constraints (max. concurrent search limit)
» A user performing prolific, ad-hoc searches can overwhelm other searches
» Infrequent long-running searches can starve out frequent short-running searches
» A deferred job is implicitly retried (repeated for the duration of its window)


e With SHC, the captain mitigates starvation and recovers missed jobs 


- Implements a heuristic approach based on job load, priority scoring, search 
history introspection, and schedule window 


http://docs.splunk.com/Documentation/Splunk/latest/Report/Configurethepriorityofscheduledreports 
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How Does Cluster Provide Always-On Services? 






e Auto SH captain failover
- Elect new captain via Raft
- Persists its records in
var/run/splunk/_raft/<server>/log
- Members register their list of artifacts,
running jobs, alerts, and search load
statistics to a new captain
- New captain enables its scheduler
- New captain executes fixups if needed


[Diagram labels: Load stats, Scheduler, Running jobs, New Captain, Old Captain]


Note
Use DNS names when initializing
members.
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Search Head Cluster Key Considerations


e Always use new Splunk instances 
- Must have at least three members 
- You cannot upgrade from an existing SH or a member of SH Pool 
» Migrate the configurations after the SH Cluster is up 
e Same hardware requirements as the dedicated search head 
- Use identical specifications for all members (bare metal or VM) 
» Works on all operating systems supported for Splunk Enterprise 
» Same version of Splunk Enterprise 
e Synchronize the system clock on all members including the indexing 
layer 
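
One way to see why three members is the practical minimum: Raft, the consensus protocol named in the terminology slide, elects a leader only with votes from a majority of members. The Python sketch below shows that general Raft arithmetic; the function names are hypothetical and it does not model any Splunk-specific election settings.

```python
def majority(member_count):
    """Votes needed to win a Raft election among member_count members."""
    return member_count // 2 + 1


def tolerated_failures(member_count):
    """Members that can be lost while a majority can still be formed."""
    return member_count - majority(member_count)


for members in (2, 3, 5):
    print(members, "members: majority", majority(members),
          "- tolerates", tolerated_failures(members), "failure(s)")
```

With two members, losing either one leaves no majority, so a third member is what first allows a captain to be elected after a single failure.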
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Sharing Splunk Server Roles


BEST PRACTICE: 


e Disable local indexing and forward everything, including all internal 
indexes, to the peer nodes (discussed later) 


e A search head cluster member should not have any other server 
roles 


WARNING: 


e A member cannot be a search peer to another search head 


- Exception: when it is configured to be a monitored instance of Monitoring
Console running on another instance
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Search Head Cluster Ports


[Diagram: connections between the search head cluster and the indexer cluster. Each cluster has its own pass4SymmKey.]

Legend:
- Management (splunkd port)
- Replication (index replication port)
- Replication (search artifact replication port)
- Search request/results (distributed search)
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Deploy a Search Head Cluster 


1. Install Splunk Enterprise and set admin password 
- Recommend LDAP/SAML 


2. Bring up and initialize all SH cluster members: 


splunk init shcluster-config -mgmt_uri https://SH2:8089
-replication_port 9200 -secret shcluster


3. Assign one of the members as the captain and set a member list:


splunk bootstrap shcluster-captain -servers_list
https://SH2:8089,https://SH3:8089,https://SH4:8089


4. Check search head cluster status:


splunk show shcluster-status
splunk list shcluster-members


Note
Search head cluster configuration is in
SPLUNK_HOME/etc/system/local/server.conf.
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Connecting a SHC to Non-Clustered Indexers 


e You have two ways to search non-clustered indexers 
- Individually add the search peers from each SHC member, or
- Enable search peer replication


» Add the search peers to one SHC member and let the SHC replicate the
peer configurations to all SHC members


» All SHC members gain access to the same set of search peers


server.conf of each SHC member
[raft_statemachine]
disabled = false
replicate_search_peers = true





- Once enabled, you can add new peers with CLI, Web, or REST API 





splunk add search-server https://<peer>:8089 -remoteUsername <user> -remotePassword <pw> 
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Connecting a SHC to Indexer Clusters 


e SHC members do not have site awareness 
- No site-by-site artifact replications 
- Using site0 can provide a seamless search experience


e Configure each SHC member as a search head on an indexer cluster 


- The search head members get their list of search peers from the manager 
node of the indexer cluster 


- Connecting to a single-site indexer cluster: 
splunk edit cluster-config -mode searchhead -master_uri https://10.0.1.3:8089
-secret idxcluster


- Connecting to a multisite indexer cluster:


splunk edit cluster-config -mode searchhead -master_uri https://10.0.1.3:8089
-secret idxcluster
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Search Head Cluster Member server.conf


[general]
# License manager password
pass4SymmKey = $1SttbJh5nUk5AM
serverName = sh2
# Indexer cluster site association
site = site0

# If connected to an indexer cluster
[clustering]
master_uri = [value illegible in source]
mode = searchhead
multisite = true
# Indexer cluster password
pass4SymmKey = $7Sw/WOkx7n6jztOoNsisPQwfB+t

# Search head cluster artifact replication port
[replication_port://9200]

[shclustering]
disabled = 0
# Member's self-identifying address to the SHCluster
mgmt_uri = https://sh2:8089
# Search head cluster password
pass4SymmKey = $7$yoi+tTs+MTFDA+I1KbDczLeEY
# Generated search head cluster ID
id = 571B9C60-66EA-4B9F-8562-27B62E93E31F
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Checking SH Cluster Status


> splunk show shcluster-status


[Sample output, largely illegible in source. It has a Captain section (label sh2, dynamic_captain : 1, an elected_captain timestamp, and the flags initialized_flag, min_peers_joined_flag, rolling_restart_flag, and service_ready_flag) followed by a Members section listing sh2, sh3, and sh4, each with label, last_conf_replication, mgmt_uri, and status. In the sample, sh3 shows last_conf_replication : Pending and sh4 shows last_conf_replication : Mon Mar 11 21:54:39 2019.]

Note
A raft metadata corruption can cause the captain election to fail. To confirm, look for ERROR SHCRaftConsensus in splunkd.log.





> splunk clean raft 
- When SHC can't elect a captain, run on all members before bootstrap 
- When SHC has an active captain but a member can't join, run on the failing members 
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Splunk Web Settings Menu Changes 


e When search heads become members of a search head cluster, the Settings menu in Splunk Web changes
- Hides all non-replicable options
» You can unhide them, if necessary
- Enables the search head clustering UI on all SHC members
» Able to perform rolling restart, manual detention and captaincy transfer
- If you need to make changes to the settings that are hidden, use the deployer to push the underlying changes

[Screenshot: Settings menu on an SHC member. KNOWLEDGE: Searches, reports, and alerts; Data models; Event types; Tags; Fields; Lookups; User interface; Alert actions; Advanced search; All configurations. SYSTEM: Health report manager; Instrumentation; Workload management. DATA: Report acceleration summaries. DISTRIBUTED ENVIRONMENT: Search head clustering. USERS AND AUTHENTICATION: Access controls; Tokens.]

[Screenshot: Search Head Clustering page ("Monitor and take action on your search head cluster"). It lists each member with its status (Up), role (Captain or Member), and last heartbeat sent to captain, and offers the Manual Detention, Transfer Captain, and Begin Rolling Restart actions.]

Splunk Cluster Administration
Copyright © 2022 Splunk, Inc. All rights reserved | 25 February 2022
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Configuration and Artifact Replication


e Captain orchestrates both configuration and artifact replication 


e Knowledge object configurations (replicated to all members): 
- Changes made via Splunk Web, CLI, or API are replicated 
- Direct .conf file edits must be implemented using the deployer
- More details are discussed in the next module 


e Artifacts (based on the search head cluster replication factor): 
- Only the artifacts resulting from scheduled reports are replicated 


- Real-time and ad-hoc search artifacts are not replicated, instead they are 
proxied (discussed later) 


- Captain enforces artifact fixups according to its replication policy 


http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/HowconfrepoworksinSHC 
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Roles and User Account Replication 


e Use any of the available authentication methods
- Splunk native authentication
» Automatically replicates the underlying .conf files and SPLUNK_HOME/etc/passwd
- SAML authentication
» Only replicates authentication.conf
» Must use the deployer to push the certificates
- Scripted authentication
» Must use the deployer to push both the script and authentication.conf

e User configurations are automatically synchronized across all members
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Ad-hoc Search Management 


e Ad-hoc search results are not replicated 


- Artifact proxying is used to access the ad-hoc search results from any 
member 


- If the search is accessed from a different member, artifact proxying calls the owner member to get the results

e To reduce the captain's workload, disable running scheduled searches on the captain
captain_is_adhoc_searchhead = true (on all members)

e To configure a member to run only ad-hoc searches, set
adhoc_searchhead = true
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Alerts 


e When the results of a search meet the alerting criteria:
- The alerts are checked and fired locally on the member that ran the search job
- The local alert information is reported to the captain

e Captain merges and maintains a global view of alerts
- Centralizes suppression information
» Remember, captain is the only scheduler and it delegates jobs
- Merged alerts and suppression information are sent to all members
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Handling Summarization 


e The captain coordinates the report and data model acceleration searches
- The resulting summaries are stored on indexers
e The summary indexes are stored only on the search head that generates them
- If you want to share them with other members, forward your summary indexes to the indexing layer

BEST PRACTICE:
e To consolidate index data, forward all indexes (including summary and internal indexes) to the indexing layer

outputs.conf
[indexAndForward]
index = false

[tcpout]
defaultGroup = default-autolb-group
forwardedindex.filter.disable = true
indexAndForward = false

[tcpout:default-autolb-group]
server=idx1:9997, idx2:9997, idx3:9997, idx4:9997
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Artifact Reaping 


e Reaping (deletion of search results) happens when artifact TTL 
expires 


e Original member reaps its search artifacts and notifies captain 
e Captain orchestrates reaping of the replicas 


e If the original member is out of commission, captain waits beyond TTL and replicas are reaped thereafter
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Restarting a Search Head Cluster 


[Screenshot: Rolling Restart dialog in Splunk Web. It asks for confirmation and warns that a rolling restart causes a phased restart of all cluster members. Options: Searchable (restart search head cluster members with minimal search interruption) and an option to restart search head cluster members despite an unhealthy search head cluster.]

> splunk rolling-restart shcluster-members
> splunk rolling-restart shcluster-members -searchable true -decommission_wait_time 180 -force false
- Puts each SHC member in manual detention to complete in-progress searches before restarting SHC members

e Members restart in phases so the cluster can continue to operate

e The captain is the final member to restart and automatically invokes captaincy transfer, thus preventing captaincy from changing during the restart process

e Deployer automatically initiates a rolling restart, when necessary

splunk rolling-restart shcluster-members -status 1
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SHC Manual Detention 


e Enable manual detention to: 
- Perform maintenance operations 
- Run diagnostics on a member 
- Do a searchable rolling restart or rolling upgrade 
e Gracefully remove a SHC member from services (detention state) with minimal 
impact on end-user search experience 
- Completes in-progress searches 
- New searches get directed to remaining members 
- Continues to participate in election and conf/bundle replication
- Does not participate in artifact replication


e How to use:
- splunk edit shcluster-config -manual_detention [on|off]
- splunk show shcluster-status
- splunk list shcluster-member-info
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Search Head Cluster Log Channels


e Heartbeat and election
- splunkd_access.log: uri_path="/services/shcluster/member/consensus*"
- Corresponding events in sourcetype=splunkd
» component=SHCRaftConsensus OR component=SHClusterMgr
» component=Metrics group=captainstability upgrades_to_captain=1
e Job scheduling
- scheduler.log: sourcetype=scheduler component=SavedSplunker status=*
- Corresponding events in sourcetype=splunkd
» component=SHCMaster delegate
» component=Metrics group=search_concurrency
e Artifact proxy and reaping
- sourcetype=splunkd_access uri_path="/services/search/jobs*"
» status!=20* indicates an issue
» method=GET isProxyRequest=true indicates a proxied request
- sourcetype=splunkd_access uri_path="/services/shcluster/captain/artifacts*"
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Useful SHC Debugging Searches 


e Election history:
- index=_internal host IN (sh2, sh3, sh4) sourcetype=splunkd component=SHCRaftConsensus "All hail leader" | stats values(event_message) by _time host

e Job scheduling status:
- index=_internal host IN (sh2, sh3, sh4) sourcetype=scheduler status=* | eval status_host=status."-".host | timechart span=5m count by status_host limit=0 usenull=f

e Skipping jobs:
- index=_internal host IN (sh2, sh3, sh4) sourcetype=scheduler status=continued OR status=skipped | eval status_host=status."-".host | timechart span=5m count by status_host limit=0 usenull=f

e Artifact proxy:
- index=_internal host IN (sh2, sh3, sh4) sourcetype=splunkd_access uri_path="/services/search/jobs*" isProxyRequest=true | stats count by method host file
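
The host list in these searches differs from cluster to cluster. The following hypothetical helper (not part of Splunk; all function names are assumptions) sketches one way to generate the same SPL strings for any set of members so they can be pasted into the search bar.

```python
def host_filter(hosts):
    """Build an SPL `host IN (...)` clause for the given search head names."""
    if not hosts:
        raise ValueError("at least one host is required")
    return "host IN ({})".format(", ".join(hosts))


def election_history_search(hosts):
    """SPL for the election-history search, scoped to the given members."""
    return (
        "index=_internal {} sourcetype=splunkd "
        'component=SHCRaftConsensus "All hail leader" '
        "| stats values(event_message) by _time host"
    ).format(host_filter(hosts))


def skipped_jobs_search(hosts, span="5m"):
    """SPL for the skipped/continued scheduler jobs search."""
    return (
        "index=_internal {} sourcetype=scheduler "
        "status=continued OR status=skipped "
        '| eval status_host=status."-".host '
        "| timechart span={} count by status_host limit=0 usenull=f"
    ).format(host_filter(hosts), span)


print(election_history_search(["sh2", "sh3", "sh4"]))
print(skipped_jobs_search(["sh2", "sh3", "sh4"]))
```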
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Further Reading: Search Head Cluster 


e Basic search head clustering concepts 
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/SHCarchitecture 
e Migrating a search head pooling environment 


http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Migratefromsearchheadpooling 
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Lab Exercise 6 — Deploy a Search Head Cluster 


e Time: 30 - 35 minutes 


e Tasks: 
- Add two more search heads to site2 
- Enable a search head cluster with site2 search heads 
- Verify that your search head cluster is functioning 
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Lab Exercise 6 — Deploy a Search Head Cluster (cont.)


[Diagram: lab environment showing the Indexer Cluster (10.0.x.1), cmanager (8089), dserver, and the Misc-Server (10.0.x.3), with SSH access as you@Public_DNS and you@10.0.x.2.]

x = Your student ID
8089 = splunkd-port
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Lab Exercise 6 — Deploy a Search Head Cluster (cont.)


[Diagram: your browser connecting to the Splunk servers in front of the Indexer Cluster.]

http://{Public_DNS}/{splunk_server}
For example:
http://{Public_DNS}/sh3
http://{Public_DNS}/sh4

Public_IP = Same as your Misc-Server
splunk_server = Splunk server name
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Module 7:
SHC Management and
Administration




Module Objectives 


e Deploy apps to a search head cluster 

e Describe when and how to transfer captaincy 

e Upgrade and manage search head cluster members 
e Monitor SHC health with Monitoring Console 
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Configuration Bundle Replication 


e All SHC members share the same configuration baseline 


e Captain functions as a source control server 
- Members replicate changes from/to captain every 5 seconds (replication cycle) 
- Members generate a snapshot every minute and purge old sets every hour 
» var/run/splunk/snapshot/*.bundle


e All members SHOULD achieve eventual consistency at some point 


- In each replication cycle, members first pull (sync) outstanding changes from the 
captain, then push (submit) local changes to the captain 


- Members resolve conflicts on pull (sync) 

- All members keep a journal of changes in etc/system/replication/ops.json 
» Members check if the diverging point is at the end of captain's log 
» Pull all changes after the diverging point and then insert the changes 
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Splunk Deployer 


e Use the deployer to distribute apps and non-replicable files into a SHC
- Does not represent a "single source of truth"
- Cannot use it alone to restore members to the latest state

e Must run on a non-search head member
- Can be enabled on an existing Splunk instance with other responsibilities (for example, a Manager Node or Deployment Server)

e No CLI configuration support
- Associate the deployer with a SHC by setting the pass4SymmKey in server.conf

e Can share a deployer with multiple SHCs if:
- The clusters have exactly the same apps and configurations and use the same secret

Deployer server.conf
[shclustering]
pass4SymmKey = <secret>
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Deploying Bundles With Deployer 


e Stage apps (configuration bundles) in the deployer's etc/shcluster 
folder 
- Optionally, configure the push modes of each app 


e Deploying bundles from deployer works in two ways:
- Push bundles out from deployer to all search head members
splunk apply shcluster-bundle -target <member:port>
» Can specify any member, but the deployer pushes the bundles to all members

- Configure search head members to fetch and sync after a restart
splunk edit shcluster-config -conf_deploy_fetch_url https://<deployer>:8089
» Adds the deployer address to the existing search head members
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Deployer Push Modes 


e The deployer deploys the app configuration to its members according to the push mode settings of each app
e The push mode can apply at the system level or on a per app basis
- Configure the settings under the [shclustering] stanza in app.conf
» deployer_push_mode
» deployer_lookups_push_mode
e The push mode options do not apply to user bundles
1. Each user bundle is first sent to the captain
2. The captain commits it to the user's local directory
» Treats the commits as member configuration changes
» Only adopts the new stanzas and ignores the existing stanzas

WARNING
DO NOT change the push mode [...] sync state.
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Deployer Push Mode Options 


e Provides flexible ways to deploy apps from the deployer
deployer_push_mode = merge_to_default | full | default_only | local_only
- merge_to_default is the default if a mode is not specified
» Is the only behavior for the versions prior to 7.3

Deployer: SPLUNK_HOME/etc/shcluster/apps | Push mode | SHC members: SPLUNK_HOME/etc/apps
appA/ (default, local) | merge_to_default | appA/ (default)
appB/ (default, local) | full | appB/ (default, local)
appC/ (default, local) | default_only | appC/ (default)
search/ (local) | local_only | search/ (local)

Note
Bundles targeting local are sent to the captain. All others are sent to all members.

WARNING
DO NOT use the deployer to push built-in apps without first setting the push mode to local_only.
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Deployer Lookup Push Mode Options 


e Prior to 7.3, the populated lookup tables in SHC get overwritten by the latest bundle update from the deployer by default
- To preserve the populated lookup tables, you execute:
splunk apply shcluster-bundle -target member -preserve-lookups true
» Can only preserve or overwrite all lookup tables
» Even experienced admins can make a mistake

e The deployer_lookups_push_mode setting is introduced in 7.3
- Provides more granular ways to preserve or overwrite lookups
- Allows a way to persist the setting per app
- deployer_lookups_push_mode = preserve_lookups (default)
» Follows the -preserve-lookups flag of the command
- deployer_lookups_push_mode = always_preserve | always_overwrite
» Ignores the -preserve-lookups flag
» Allows the deployer_lookups_push_mode of the app to decide
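
The interaction between the per-app setting and the command-line flag can be summarised as a small decision function. This is an illustrative model only, assuming the three modes behave exactly as the bullets describe; the function name lookups_preserved is hypothetical and is not a Splunk API.

```python
def lookups_preserved(app_mode="preserve_lookups", preserve_flag=False):
    """Return True if an app's populated lookup tables survive a bundle push.

    app_mode      -- the app's deployer_lookups_push_mode value
    preserve_flag -- the -preserve-lookups value given to
                     `splunk apply shcluster-bundle`
    """
    if app_mode == "always_preserve":
        return True  # the flag is ignored
    if app_mode == "always_overwrite":
        return False  # the flag is ignored
    if app_mode == "preserve_lookups":
        return bool(preserve_flag)  # the flag decides
    raise ValueError("unknown deployer_lookups_push_mode: {!r}".format(app_mode))


for mode in ("preserve_lookups", "always_preserve", "always_overwrite"):
    for flag in (False, True):
        print(mode, flag, lookups_preserved(mode, flag))
```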
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Deployer Push Mode Precedence 


e The push mode can be set globally or locally for specific apps
- If the push mode settings are not explicitly set under the specific app, the app follows the global deployer policy
- The local takes precedence over any corresponding default settings
- The settings under etc/shcluster/apps take precedence over etc/system

Note
The SHC members' existing configurations in local always take precedence over pushed configurations from a deployer.

[Diagram: deployer directory tree. etc contains system (default, local) and shcluster (apps, users); apps contains appA and appB, each with default and local.]
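
The precedence rules on this slide can be sketched as a lookup over four layers. This is a simplified model, not Splunk code: the function name and the four parameters (the deployer_push_mode value found in the app's local and default app.conf, and in the system-level local and default app.conf on the deployer) are assumptions for illustration.

```python
DEFAULT_PUSH_MODE = "merge_to_default"  # used when no mode is specified


def effective_push_mode(app_local=None, app_default=None,
                        system_local=None, system_default=None):
    """Resolve an app's push mode.

    App-level settings win over system-level (global) settings, and within
    each level local wins over default. None means "not set at this layer".
    """
    for value in (app_local, app_default, system_local, system_default):
        if value is not None:
            return value
    return DEFAULT_PUSH_MODE


print(effective_push_mode())                                    # nothing set
print(effective_push_mode(system_local="full"))                 # global policy
print(effective_push_mode(app_default="default_only",
                          system_local="full"))                 # app overrides
```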
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Controlling the Deployment Process 


e By default, splunk apply shcluster-bundle automatically triggers a rolling restart as necessary

e To control the rolling restart, you can apply in phases
splunk apply shcluster-bundle -target <member:port>
splunk apply shcluster-bundle -target <member:port>
splunk rolling-restart shcluster-members

CAUTION: Use the following process with extreme care
e To delete all the apps on SHC members previously distributed by the deployer, you can push an empty bundle
- The app must be in state=enabled (check its app.conf)
- The deployer cannot remove the app if it is in state=disabled
splunk apply shcluster-bundle -target <member:port>
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Multithreaded SHC Deployer Bundle Push 


e The deployerPushThreads setting allows you to concurrently send a large number of apps to a large number of SHC members

e Splunk checks the setting in the shclustering stanza of server.conf on the deployer and launches the specified number of threads
- auto = deployer auto-tunes threads (one per member) depending on the number of members returned by the captain (preferred setting)
- 1 = default setting
- <positive integer> = sets thread number

server.conf
[shclustering]
deployerPushThreads = auto
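
To make the three accepted values concrete, the sketch below models how many push threads would result from each. It is an illustration under the assumption that auto means exactly one thread per member, as stated above; the function name is hypothetical and any internal limits Splunk applies are not modelled.

```python
def deployer_push_threads(setting="1", member_count=1):
    """Return the number of push threads implied by deployerPushThreads.

    setting      -- "auto" or a positive integer given as text
    member_count -- number of SHC members returned by the captain
    """
    if setting == "auto":
        return max(1, member_count)  # one thread per member
    threads = int(setting)
    if threads < 1:
        raise ValueError("deployerPushThreads must be 'auto' or a positive integer")
    return threads


print(deployer_push_threads())            # default setting
print(deployer_push_threads("auto", 5))   # auto-tuned for five members
print(deployer_push_threads("4", 10))     # explicit thread number
```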
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Working with Deployment Server 


e You CANNOT use deployment server to directly distribute apps to the peer nodes or SHC members

e You can use it to distribute apps to the master-apps and shcluster directories

serverclass.conf on Deployment Server
[serverClass:idxc_x]
stateOnClient = noop
restartSplunkd = false

[serverClass:shc_y]
stateOnClient = noop
restartSplunkd = false

deploymentclient.conf on Manager Node
[deployment-client]
serverRepositoryLocationPolicy = rejectAlways
repositoryLocation = SPLUNK_HOME/etc/master-apps

deploymentclient.conf on Deployer
[deployment-client]
serverRepositoryLocationPolicy = rejectAlways
repositoryLocation = SPLUNK_HOME/etc/shcluster/apps
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Bundle Deployment Summary 


App bundle deployment (Deployer to SHC members)
> splunk apply shcluster-bundle
e Sends bundles as defined in server.conf
e Sends to the captain first

Search bundle deployment (SHC captain to search peers)
e Bundle replication to search peers happens only from the captain
e Sends bundles in parallel per bundle

[Diagram: the Deployer above SHC Member 1, SHC Captain, and SHC Member 2; the captain connects to Peer 1, Peer 2, ..., Peer n-1, Peer n.]
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SH Cluster Conf Replication Log Channels 


e conf.log records recent configuration replication activity
index=_internal sourcetype=splunkd_conf | timechart list(data.task) by host
- Deployer-specific actions:
» createDeployableApps: build bundles in var/run/splunk/deploy
» populateDeployableInfo: read bundles from var/run/splunk/deploy
» sendDeployableApps: push baseline apps to a search head cluster member
- SHC member-specific actions:
» addCommit: changes made locally on this member
» pullFrom: changes pulled from the captain to this member
» acceptPush: changes pushed from a member to this instance
» computeCommon: initial negotiation with the captain
» purgeEligible: purge in-memory and on-disk data to reduce resource usage
» installSnapshot: update to a snapshot of the latest bundle from the captain
» downloadDeployableApps: install baseline apps from the deployer on startup
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Checking SHC Bundle Deployment Status 


e Search from a search head member:
index=_internal component=ConfDeployment data.task=*Apps
| table host data.source data.target_label data.task data.status

e Check the Search Head Clustering: App Deployment dashboard
- MC > Search > Search Head Clustering

[Screenshot: Search Head Clustering: App Deployment dashboard. The Apps Status panel shows the status of apps pushed by deployer to search head cluster members (2 apps; shc_base is Synchronized). Synchronized: all members have the latest version of the app. Out of Synchronization: one or more members does not have the latest version of the app. Clicking a row shows the app checksum on each search head cluster member and on the deployer.]

http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/HowconfrepoworksinSHC
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Some Useful Debugging Searches


e Find missing baseline:
index=_internal sourcetype=splunkd_conf STOP_ON_MISSING_LOCAL_BASELINE | timechart count by host
e Overall configuration replication behavior:
index=_internal sourcetype=splunkd_conf pullFrom data.to_repo!=*skipping* | timechart count by data.to_repo
e Evidence of captain switching:
index=_internal sourcetype=splunkd_conf pullFrom data.from_repo!=*skipping* | timechart count by data.from_repo
e Find the destructive resync events:
index=_internal sourcetype=splunkd_conf installSnapshot | timechart count by host
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More Useful Conf Replication Log Channels 


e splunkd.log components
- ConfReplication
- ConfReplicationThread
- ConfReplicationHandler
- loader (check events during startup)

e Replication performance: metrics.log group=conf
- wallclock_ms_total is for an action and wallclock_ms_max is for a single invocation
e Network activity between the members:
- sourcetype=splunkd_access uri_path="/services/replication/configuration*"
e Network activity between the members and the deployer:
- sourcetype=splunkd_access uri_path="/services/apps/local*" OR uri_path="/services/apps/deploy*"
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Types of SHC Captaincy 


e SHC has two types of captaincy designed to handle specific cases 
- Dynamic captain (default) 


» If you want to reassign captaincy, you can manually transfer captaincy to a 
preferred member 


- Static captain 
» When members are unable to elect a captain 
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Controlling Captaincy 


e Why? 
- SHC Captain consumes more CPU and memory resources
- If a system with fewer cores becomes the captain, the overall scheduling capacity gets lowered
» SHC search capacity = captains_scheduler_count x #_of_members with scheduled search enabled (adhoc_searchhead = false)
- Assign captaincy to members based upon geographic preference
e How?
- Set preferred_captain=false in server.conf to exclude a member
- SHC tries to elect a member with preferred_captain=true (default), but this is not always possible
» For example, if no preferred members are reachable by the majority
- When a non-preferred captain is chosen, the first available preferred captain member requests a captaincy transfer
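
The capacity formula above can be worked through with a few lines of arithmetic. This is a sketch of that one formula only; the member records and scheduler counts below are made-up illustration values, and the effect of captain_is_adhoc_searchhead is not modelled.

```python
def shc_scheduled_search_capacity(captain_scheduler_count, members):
    """Capacity = captain's scheduler count x members running scheduled searches.

    members -- list of dicts; a member with adhoc_searchhead=True runs only
               ad-hoc searches and is therefore excluded.
    """
    eligible = sum(1 for m in members if not m.get("adhoc_searchhead", False))
    return captain_scheduler_count * eligible


# Hypothetical three-member cluster in which sh3 only runs ad-hoc searches.
members = [
    {"name": "sh2"},
    {"name": "sh3", "adhoc_searchhead": True},
    {"name": "sh4"},
]

# The same cluster under a larger captain and under a smaller one.
print(shc_scheduled_search_capacity(10, members))
print(shc_scheduled_search_capacity(4, members))
```

Because the captain's own scheduler count is the multiplier, electing a smaller machine as captain lowers the capacity of the whole cluster, which is the reason for steering captaincy with preferred_captain.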
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Transferring Captaincy 


e To transfer the captaincy to a preferred member, run this command 
from any member: 


splunk transfer shcluster-captain -mgmt_uri <new_captain> 


e NOTE:
- splunk rolling-restart shcluster-members automatically invokes the captaincy transfer
e To check the transfer status, run this from any member:
splunk show shcluster-status
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SHC Static Captain


e During a network partition failure, minority group members are unable to elect a captain

e Splunk admins can designate a captain for the minority group as a temporary workaround
- On the new captain node (SH3):
splunk edit shcluster-config -election false -mode captain -captain_uri https://SH3:8089
- On the rest of the minority group members:
splunk edit shcluster-config -election false -mode member -captain_uri https://SH3:8089
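
Why the minority side of a partition cannot elect a captain comes down to a majority test against the full cluster size, including unreachable members. A minimal sketch of that test (the function name is hypothetical):

```python
def can_elect_captain(reachable_members, cluster_size):
    """A group can elect a dynamic captain only with a strict majority
    of ALL cluster members, including the ones it cannot reach."""
    return reachable_members > cluster_size // 2


# A five-member cluster split 3 / 2 by a network partition.
print(can_elect_captain(3, 5))  # majority side
print(can_elect_captain(2, 5))  # minority side: needs a static captain

# A four-member cluster split evenly: neither side has a majority.
print(can_elect_captain(2, 4))
```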
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Restoring the Cluster to Dynamic Captain


Upon resolution of the issue, revert to the dynamic captain mode

1. On all majority members: Disable election and report to the static captain
splunk edit shcluster-config -election false -mode member -captain_uri https://SH3:8089

2. On all members: Re-enable election; change the static captain last
splunk edit shcluster-config -election true -mgmt_uri <THIS_MEMBER>:8089

3. On SH3: Bootstrap as the new dynamic captain for the entire SHC
splunk bootstrap shcluster-captain -servers_list <SEARCH_HEAD_MEMBER_LIST>
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Adding Search Head Members 


e Why always use new Splunk instances? 
- SH cluster configuration consistency is derived from the journals 
- The ops.json file is fixed in size (configurable) and gets rolled
- Non-members do not have this file
- Members downed for a long time may not be able to find the diverging point 


e Add an offline member to the search head cluster: 
- Restart the offline member 
» Receives the set of intervening changes from the captain and should resync 
- If unable to fully recover, you can force the resync on the re-joining member: 
splunk resync shcluster-replicated-config 
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Handlememberfailure
e Add a new member or a previously decommissioned member: 


- Details on the next slide 
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Adding a New or Decommissioned Member 


1. Delete Splunk Enterprise, if exists 
2. Install and initialize the instance 


3. Join the search head cluster 
- To announce itself to an existing SH cluster, run on the new instance:
splunk add shcluster-member -current_member_uri https://<any_existing_member>:8089

- To introduce a new member to the cluster, run from any existing member:
splunk add shcluster-member -new_member_uri https://<new_member>:8089
e When a new member joins the cluster, it gets: 
- The deployed bundles from the deployer 
- The replicated configurations and artifacts from the captain 
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Decommissioning a SH Cluster Member 





e Is the member downed unannounced?
- Yes: Run from any SHC member:
splunk remove shcluster-member -mgmt_uri <downedSH:m_port>
- No: Is it temporarily decommissioning?
» Yes: Run on the decommissioning member:
splunk remove shcluster-member
splunk stop
» No: Run on the decommissioning member:
splunk remove shcluster-member
splunk disable shcluster-config

Important
DO NOT just run: splunk stop.
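
The decision flow for decommissioning can be written down as a small function. This is a sketch that assumes the flowchart reads as two questions (downed unannounced, then temporary or permanent); the function name and return shape are hypothetical, and the commands are the ones shown on this slide.

```python
def decommission_steps(down_unannounced, temporary=False,
                       mgmt_uri="<downedSH:m_port>"):
    """Return (where_to_run, commands) for removing an SHC member."""
    if down_unannounced:
        # The member is unreachable, so another member must remove it.
        return ("any SHC member",
                ["splunk remove shcluster-member -mgmt_uri {}".format(mgmt_uri)])
    commands = ["splunk remove shcluster-member"]
    # A temporary removal only stops the instance; a permanent one also
    # disables the search head cluster configuration on it.
    commands.append("splunk stop" if temporary else "splunk disable shcluster-config")
    return ("the decommissioning member", commands)


for case in ((True, False), (False, True), (False, False)):
    where, cmds = decommission_steps(*case)
    print(case, "->", where, cmds)
```

In every branch the member is removed from the cluster first; stopping the instance alone is never the first step.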
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Upgrading a Search Head Cluster 


e The process is the same for maintenance and major release upgrades
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/UpgradeaSHC

e Upgrade from 7.1+? If yes, you can perform a searchable rolling upgrade
- All SHC members and indexer cluster nodes must be running version 7.1 or later
e Upgrade from 6.4+? If yes, you can perform a rolling upgrade:
1. Upgrade one member (Stop > Upgrade > Start)
2. Wait until it re-joins the SH cluster
3. Transfer captaincy to this member (CLI only)
4. Upgrade the rest of the members
5. Optionally, transfer captaincy back to the original captain
6. Upgrade the deployer
e Otherwise, perform a non-rolling upgrade:
- Must stop all members and the deployer before you can upgrade

Note
All cluster members must be on the same version down to the maintenance level.

Note
It is not necessary to upgrade the indexers at the same time.
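
The version thresholds in this decision flow can be expressed as a short function. This is a sketch only: the function names are hypothetical, version strings are assumed to be dot-separated integers, and the additional requirement that indexer cluster nodes also run 7.1 or later for a searchable rolling upgrade is not checked.

```python
def parse_version(version):
    """Turn a version string such as "7.2.4" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def shc_upgrade_method(current_version):
    """Pick the upgrade method available from the version being upgraded."""
    v = parse_version(current_version)
    if v >= (7, 1):
        return "searchable rolling upgrade"
    if v >= (6, 4):
        return "rolling upgrade"
    return "non-rolling upgrade"


for version in ("7.2.4", "7.0.5", "6.4.0", "6.3.1"):
    print(version, "->", shc_upgrade_method(version))
```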
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Splunk Cluster Searchable Rolling Upgrade


e Upgrade or patch Splunk clusters with minimal or no downtime

e All SHC members and indexer nodes must be running version 7.1 or later
- Supported only in dynamic captaincy mode
- Avoid admin operations while upgrading
» e.g. bundle push, node addition

e Be prepared to intervene in case a node cannot restart

1. Upgrade the manager node
2. Upgrade the SH cluster
3. Upgrade the deployer
4. Upgrade indexer cluster peer nodes (site by site)
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SH Cluster Searchable Rolling Upgrade 


1. Check the SHC health
- If it is not OK, fix the issues and try again
2. Initialize the SHC searchable rolling upgrade
splunk upgrade-init shcluster-members
3. Select a member and put it on detention
splunk edit shcluster-config -manual_detention on
4. Confirm active historical search count = 0
splunk list shcluster-member-info
5. Stop the member, upgrade it, then start it and turn off the detention
splunk edit shcluster-config -manual_detention off
6. Confirm that the member is able to re-join the SHC
7. Repeat steps 3 to 6 until all members are upgraded
8. Finalize the searchable rolling upgrade
splunk upgrade-finalize shcluster-members

Note
Splunk provides a template script, SPLUNK_HOME/bin/shc_upgrade.py, that you can modify and use.
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Key SH Cluster Maintenance Commands 


> splunk help shclustering 

e Helpful CLI commands (can be run on Captain or members) 
splunk rolling-restart shcluster-members 
splunk show shcluster-status 

e Helpful CLI commands to run on the SHC members 
splunk [edit|list] shcluster-config 
splunk add shcluster-member 
splunk resync shcluster-replicated-config 
splunk clean raft 

e Helpful CLI commands to run on the SHC deployer 


splunk apply shcluster-bundle 
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Search Head Cluster Health Report


e feature:shc_members_overview tracks the status of SHC
- Status turns yellow if members skip a heartbeat and red if they skip more times
- replication_factor checks if enough search head cluster members exist to honor the artifact replications (yellow only)
- detention turns yellow if a member is in manual detention, red if a member is in automatic detention

e feature:shc_captain_election_overview tracks the quorum majority required to re-elect a dynamic captain (yellow only)

e feature:shc_captain_common_baseline reflects if the bundles are synced across members (red only)
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Search Head Cluster Dashboards in MC 


[Screenshot: Monitoring Console Search menu (Activity, Distributed Search, Search Head Clustering, KV Store). Under Search Head Clustering: Status and Configuration, Configuration Replication, Artifact Replication, and Scheduler Activity. The Health Check panel reports "All members in this cluster have a healthy heartbeat status." and "All members in this cluster share a common baseline."]



e SHC Status and Configuration provides high-level SHC health 
- The last heartbeats should be nearly the same 
- Heartbeat drift on a member and frequent captain elections indicate possible problems 
e Configuration Replication shows how user changes propagate among the members 
- Drill down to see the members' common baseline and detect unpublished changes 
e Growing backlogs in Artifact Replication Job Activity should be investigated 
e In Scheduler Activity, monitor Skip Ratio and Execution Latency 
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Troubleshooting the SHC with Monitoring Console 





By default, members report a heartbeat every five seconds and the captain marks a member as Down after missing for 60 seconds. 

Health Check 
- Warning: There are members in this cluster that do not have a healthy heartbeat status. 
- Warning: There are members in this cluster that do not share a common baseline. Action may be required. 

Instance  Role     Status  Last Heartbeat Sent to Captain  Configuration Baseline Consistency  Number of Unpublished Changes  Artifact Count 
sh2       Captain  Up      03/12/2018 21:45:05 +0000       2/3                                 0                              12 
sh3       Member   Up      03/12/2018 21:45:03 +0000       2/3                                 0                              12 
sh4       Member   Down    03/12/2018 21:43:04 +0000       N/A                                 N/A                            0 

If a member was down for a long period of time, it may not be able to find a baseline and a manual resync on the member is required. 

Health Check 
- OK: All members in this cluster have a healthy heartbeat status. 
- Warning: There are members in this cluster that do not share a common baseline. Action may be required. 

Instance  Role     Status  Last Heartbeat Sent to Captain  Configuration Baseline Consistency  Number of Unpublished Changes  Artifact Count 
sh2       Captain  Up      03/12/2018 21:45:05 +0000       3/3                                 0                              12 
sh3       Member   Up      03/12/2018 21:59:18 +0000       3/3                                 0                              12 
sh4       Member   Up      03/12/2018 21:59:19 +0000       1/3                                 0 
(sh4 tooltip: missing common baseline with the captain: https://10.0.1.2:8289) 

To restore consistency: 
splunk resync shcluster-replicated-config 

Health Check 
- OK: All members in this cluster have a healthy heartbeat status. 
- OK: All members in this cluster share a common baseline. 

Instance  Role     Status  Last Heartbeat Sent to Captain  Configuration Baseline Consistency  Number of Unpublished Changes  Artifact Count 
sh2       Captain  Up      03/12/2018 22:01:41 +0000       3/3                                 0                              12 
sh3       Member   Up      03/12/2018 22:01:44 +0000       3/3                                 0                              12 
sh4       Member   Up      03/12/2018 22:01:44 +0000 
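
The Down status in the first status table on this slide follows from the 60-second timeout. The sketch below is only an illustration of that rule, not how the captain is implemented: the function name is hypothetical, and the captain's own heartbeat timestamp is assumed to stand in for the current time.

```python
from datetime import datetime, timedelta

# Default quoted on the slide: a member is marked Down after 60 seconds without a heartbeat.
HEARTBEAT_TIMEOUT = timedelta(seconds=60)

# Timestamp layout used in the Monitoring Console table, e.g. "03/12/2018 21:45:05 +0000".
TIME_FORMAT = "%m/%d/%Y %H:%M:%S %z"


def member_status(last_heartbeat: str, now: str) -> str:
    """Return "Up" or "Down" given a member's last heartbeat and the current time."""
    silence = datetime.strptime(now, TIME_FORMAT) - datetime.strptime(last_heartbeat, TIME_FORMAT)
    return "Down" if silence > HEARTBEAT_TIMEOUT else "Up"


# Last heartbeats copied from the first table; "now" is an assumption for illustration.
now = "03/12/2018 21:45:05 +0000"
heartbeats = {
    "sh2": "03/12/2018 21:45:05 +0000",
    "sh3": "03/12/2018 21:45:03 +0000",
    "sh4": "03/12/2018 21:43:04 +0000",
}
statuses = {name: member_status(ts, now) for name, ts in heartbeats.items()}
print(statuses)
```

sh4 has been silent for 121 seconds at that point, which is why it is the only member reported as Down.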
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Using Diag UI on Monitoring Console 

e A fully configured Monitoring Console running Splunk 7.1+ can provide a convenient diag collection service 
- Access Settings > Instrumentation from an MC node 
- Can collect diags one at a time or from multiple instances by role or host name 
- The user requires the get_diag capability 
e MC node authenticates with remote nodes via: 
- Auth token: independent search peers 
- Index cluster secret: manager & peer nodes 
- Search head cluster secret: SHC members 
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New Diagnostics Bundle 

[Screenshot: New Diagnostics Bundle dialog, "Select instance you want to collect data from." A role filter (All Roles, search head, cluster master, indexer, cluster slave, search peer, cluster search head, kv store, shc captain, shc member) sits above the instance list.] 

Instance  Roles 
sh2       cluster search head, search head, search peer, kv store, shc captain 
sh3       cluster search head, search head, search peer, kv store, shc member 
sh4       cluster search head, search head, search peer, kv store, shc member 
cmaster   cluster master, search head, search peer 
Four further instances: indexer, cluster slave, search peer 
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Diagnostic Log 


e Lists diags previously created 
- Expires after 30 days 


e Available actions: 
- Download 
- Delete 
- Recreate 
» Useful to track changes over time 


[Screenshot callouts: Version 6.0; Remote diag task; Any instance, except a manager node; Exclude lookups; View and access a remote diag; Manager node; Select component granularity] 
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Instrumentation 


[Screenshot: Settings > Instrumentation page. "Configure automated reporting settings, view collected data, export data to file, work with diagnostic files, and send data to Splunk."] 

Usage Data: view license usage, anonymized usage, and support usage data that has been collected (does not include browser session data). The panel lists daily reports covering 2018-03-08 through 2018-03-15, sent at 03:01:21 each day from 2018-03-09 to 2018-03-16, all with status "success" and a "View in Search: License Data" action. 

Diagnostic Log: diagnostic files contain information about your Splunk deployment, such as configuration files and logs, to help Splunk Support diagnose and resolve problems. The panel has a New Diag button and lists Diag-2018-03-16 (created 2018-03-16 09:51:24) and Diag-2018-03-15 (created 2018-03-15 17:49:06; nodes sh2, sh3, sh4). Every entry shows status Success, sizes range from 62.61 MB to 190.94 MB, and the available actions are Recreate, Download, and Delete. 
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Further Reading: 


e Search head forwarding 
https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Forwardmanagerdata 


e Distribute apps and configuration updates with deployer 
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/PropagateSHCconfigurationchanges 


e Search head cluster status and troubleshoot issues 
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/ViewSHCstatusinDMC 


e Use static captain to recover from loss of majority 
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Staticcaptain 
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Lab Exercise 7 - Deploy a SHC App 


e Time: 25 - 30 minutes 


e Tasks: 
- Configure the SHC members and the deployer 
- Stage and distribute apps to search head cluster members 
- Verify the app deployment 
- Complete the Monitoring Console setup on dserver 
- Review the search head cluster dashboards 
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Lab Exercise 7 - Deploy a SHC App (cont.) 

[Lab diagram labels: Indexer Cluster (10.0.x.1) with cmanager on 8089; dserver (Deployer) on 8189; Misc-Server (10.0.x.3); access via SSH you@Public_DNS and http://{Public_DNS}/dserver] 

X = Your student ID 
8?89 = splunkd-port 
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Lab Exercise 7 - Deploy a SHC App (cont.) 

[Lab diagram labels: Indexer Cluster; Your Browser] 

http://{Public_DNS}/{splunk_server} 
For example: 
http://{Public_DNS}/sh2 
http://{Public_DNS}/sh3 
http://{Public_DNS}/sh4 
http://{Public_DNS}/sh5 (Optional) 

Public_IP = Same as your Misc-Server 
splunk_server = Splunk server name 
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Module 8: 
KV Store Management in 
Splunk Clusters 
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Module Objectives 


e Enable a KV store collection under a search head cluster 
e Manage KV store collections in a search head cluster 
e Monitor and troubleshoot KV store issues in Splunk clusters 
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Overview of Splunk Lookups 


e A lookup is a Splunk data enrichment knowledge object 
- Splunk lookups are supported during search time and the parsing phase 
- Defined in props.conf and transforms.conf 
e File-based lookup is used for datasets that are small and/or change infrequently 
- Uses CSV files stored in the lookups directory 
- Gets replicated to search peers 
e KV store lookup is designed for large key-value collections that change frequently 
- Must additionally be defined in collections.conf 
- Can optionally configure data type enforcement and field accelerations 
- Collections live only on the search heads by default 
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CSV Lookup Challenges in Distributed Environment 


e Each search head manages and replicates its own CSV lookup files 
- Updates to lookup files propagate to search peers independently 
e Searches can fail if a bundle is not replicated to search peers in time 
- By default, a bundle larger than 2GB is not replicated to the search peers 
e In Splunk clusters, the timely replication of lookup files is more critical 
- By default, SHC conf replication rejects a bundle size larger than 2GB 

[Diagram labels: Indexer 1, Indexer 2, ..., Indexer n-1, Indexer n] 
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Benefits of KV Store Lookup 


e Quick per-record updates 
- Performs Create-Read-Update-Delete (CRUD) operations on individual 
records using the Splunk REST API and SPL commands 
e Standardized interface for app developers 
- REST API and data type validation 


e Faster lookup replication from SHC to search peers 
- Enable KV store replication first per collection 
» Distributes a collection in CSV lookup files 
» Can perform automatic lookups on the index tier 


e Can facilitate backup and restore of KV store data 
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Enabling KV Store Collections 


e Before users can use a KV store, an admin must create a collection 
e A collection is defined in collections.conf 
- Must be placed in an app's default or local directory 
» Other attempts are ignored 
- Specify the name of the collection (stanza) and optional attributes 
» Matching lookup field, output fields, etc. 
- Enforcing data types is optional 
» If enforced, any input that does not match the type is silently dropped 
- More options discussed later 

collections.conf syntax: 

    [<collection_name>] 
    enforceTypes = [true|false] 
    field.<name1> = [number|string|bool|time] 
    field.<name2> = [number|string|bool|time] 
    accelerated_fields.<xl-name> = <json> 

collections.conf example: 

    [mykv] 
    enforceTypes = true 
    field.x = number 
    field.y = string 
    accelerated_fields.xl2 = {"x": 1, "y": 1} 
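
For readers who generate app configuration from scripts, the stanza layout above can be produced programmatically. This is a minimal sketch, assuming a hypothetical helper named render_collection; it only formats text and rejects field types outside the four listed on the slide, and it is not part of any Splunk tooling.

```python
import json

# Field types listed in the collections.conf syntax on the slide.
VALID_TYPES = {"number", "string", "bool", "time"}


def render_collection(name, fields, enforce_types=False, accelerations=None):
    """Render one collections.conf stanza as text (hypothetical helper)."""
    lines = [f"[{name}]", f"enforceTypes = {'true' if enforce_types else 'false'}"]
    for field, field_type in fields.items():
        if field_type not in VALID_TYPES:
            raise ValueError(f"unsupported type for field {field!r}: {field_type!r}")
        lines.append(f"field.{field} = {field_type}")
    for accel_name, spec in (accelerations or {}).items():
        # Acceleration definitions are written as JSON, as in the slide example.
        lines.append(f"accelerated_fields.{accel_name} = {json.dumps(spec)}")
    return "\n".join(lines)


stanza = render_collection(
    "mykv",
    {"x": "number", "y": "string"},
    enforce_types=True,
    accelerations={"xl2": {"x": 1, "y": 1}},
)
print(stanza)
```

The printed text matches the mykv example and would still need to be saved in an app's default or local directory to take effect.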
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Populating KV store Lookups 


e Create a KV store collection stanza in collections.conf 
- Enable optional attributes, if needed 
e Add lookup definition for KV store 
- Click Settings > Lookups > Lookup definitions 
- The resulting configuration is saved in transforms.conf 
e Write data to the KV store 
- Search: ... | fields id, location, type | outputlookup <lookup_name> 
- Or, use REST APIs 

curl -k -u <user>:<pw> \ 
  https://<url:mport>/servicesNS/<user>/<app>/storage/collections/data/<lookup_name> \ 
  -H 'Content-Type: application/json' \ 
  -d '{"id": "001", "location": "CA", "type": "basic"}' 

Note: Working with KV store is discussed in detail in Building Splunk Apps. 
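
The same REST call can be prepared from a script. The sketch below builds, but does not send, the equivalent POST request with the Python standard library. The host, credentials, app, and collection values are placeholders chosen for illustration, and build_kv_insert is a hypothetical helper name.

```python
import base64
import json
import urllib.request


def build_kv_insert(base_url, user, app, collection, record, username, password):
    """Build (without sending) a POST that inserts one record into a KV store collection."""
    url = f"{base_url}/servicesNS/{user}/{app}/storage/collections/data/{collection}"
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


# Every value below is a placeholder, not a real endpoint or credential.
request = build_kv_insert(
    base_url="https://sh2.example.com:8089",
    user="nobody",
    app="search",
    collection="mykv",
    record={"id": "001", "location": "CA", "type": "basic"},
    username="admin",
    password="changeme",
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen with an SSL context suited to the search head's certificate, which corresponds to the -k option in the curl example.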
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How KV Store Works In SH Cluster 


e Within a SHC, the KV store forms its own cluster 
- Can use up to 50 SHC members 
e KV store port must be accessible from all SHC members 
- Uses 8191 by default 
e SHC captain and KV store primary synchronize their member list every time there is a status change 
- Add, remove, or restart 
e By default, the value of mgmt_uri in [shclustering] is used for KV store connection and replication 

[Diagram labels: Search Head Cluster (READY); SHC Consensus; Member; SHC Replication; KV Store Consensus; Secondary; Secondary] 
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KV Store Collection Replication — Write 


e The primary receives all write operations and records them in its journal 
e The secondaries copy and apply these operations asynchronously 
e Writes are acknowledged when: 
- The majority of voting KV store nodes have applied the operations 
- The writes have successfully been logged to their respective journals 
e Checkpoints in journals provide data consistency 
- Provide recovery information 

[Diagram labels: SH1; SH2; Primary; Secondary; replicate w{x=1}; j{ok}; Collection READY] 
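
The majority rule for acknowledging a write can be stated as simple arithmetic. This is a minimal sketch of that rule only; the function names are hypothetical and nothing here models journaling or the replication protocol itself.

```python
def write_majority(voting_members: int) -> int:
    """Smallest number of voting nodes that forms a majority."""
    if voting_members < 1:
        raise ValueError("a KV store cluster needs at least one voting member")
    return voting_members // 2 + 1


def is_acknowledged(applied: int, voting_members: int) -> bool:
    """True once a majority of the voting nodes have applied the write."""
    return applied >= write_majority(voting_members)


for members in (3, 4, 5):
    print(members, "voting members need", write_majority(members), "acknowledgements")
```

With three members, one lagging secondary does not block writes, because the primary plus one secondary already form a majority.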
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KV Store Collection Replication — Read 


e READ operations can be done on any member 
e Routed to the nearest member 
- Read from the lowest-latency member 
- Most reads are done with the local instance 

[Diagram labels: Member; Secondary; preference; Secondary; {x=1}] 
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KV Store CLI Commands 


e The same conditions that cause SHC members to end up in an inconsistent replication state can cause the KV store collections to be out of sync 
- Frequent status changes to the SHC member instances 
- Changing the GUID or hostname of a member 
- Depending on the condition of the SHC, the KV store cluster can also end up in a state where it is impossible to sync 
e One workaround is to bootstrap the SHC from scratch 
- WARNING: Proceed ONLY when it is absolutely necessary 
e Useful commands 
- splunk show kvstore-status 
- splunk [backup | restore] kvstore 
- splunk clean kvstore 
- splunk resync kvstore -source <GUID> 
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Splunk show kvstore-status 


Example run on SH2 

splunk show shcluster-status 

 Captain: 
    dynamic_captain : 1 
    elected_captain : Fri Mar 9 00:32:52 2018 
    id : CE1F9057-981B-4419-BBA1-F10A5F3FBC6A 
    initialized_flag : 1 
    min_peers_joined_flag : 1 
    rolling_restart_flag : 0 
    service_ready_flag : 1 

 Members: 
  sh2 
    label : sh2 
    mgmt_uri : https://10.0.1.2: 
    mgmt_uri_alias : https://10.0.1.2: 
    status : Up 
  sh3 
    label : sh3 
    last_conf_replication : Wed Mar 14 43 2018 
    mgmt_uri : https://10.0.1.2:8389 
    mgmt_uri_alias : https://10.0.1.2:8389 
    status : Up 
  sh4 
    label : sh4 
    last_conf_replication : Wed Mar 14 43 2018 
    mgmt_uri : https://10.0.1.2:8489 
    mgmt_uri_alias : https://10.0.1.2: 
    status : Up 

splunk show kvstore-status 

 This member: 
    replicationStatus : Non-captain KV store member 
    standalone : 0 
    status : ready 

 Enabled KV store members: 
    guid : 13E6C3B3-293D-468B-A27D-66EC0D1358D8 
    hostAndPort : 10.0.1.2:8291 

 KV store members: 
    configVersion : 3 
    electionDate : Mon Mar 12 21:43:19 2018 
    electionDateSec : 1520899999 
    hostAndPort : 10.0.1.2:8391 
    lastHeartbeat : Wed Mar 14 23:01:54 2018 
    lastHeartbeatRecv : Wed Mar 14 23:01:55 2018 
    optimeDate : Wed Mar 14 23:01:54 2018 
    optimeDateSec : 1521068514 
    pingMs 
    uptime : 
  10.0.1.2:8291 
    configVersion : 
    hostAndPort : 10.0.1.2:8291 
    optimeDate : Wed Mar 14 23:01:54 2018 
    optimeDateSec : 1521068514 
    replicationStatus : Non-captain KV store member 
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Monitoring KV Store Status with MC 


e MC provides more details on KV store status 
- In the MC General Setup page, add the KV store server role to the instance 
- Go to MC > Search > KV Store: Deployment 

KV Store Status 
Click instance name for more details. Total queued is operations (readers and writers) waiting for a read or write lock to be cleared. The dashboard also shows a Lock (%) column. 

Instance  Physical Memory Usage (MB)  Mapped Memory Usage (MB)  Page Faults per Operation  Total Queued  Active Connections  Last Flush (ms)  Network Traffic (MB)  Uptime (hours)  Replication Role 
sh2       186                         2558                      0.00                       0             19                  1                1636.22               142.73          Primary 
sh3       176                         2558                      0.00                       0             15                  1                1700.68               142.74          Secondary 
sh4       169                         2558                      0.00                       0             13                  1                561.55                49.30           Secondary 

[Chart: Historical Charts, "Instances by Median Page Faults per Operation"; aggregation: Median; view: Column Chart or Heat Map; bands 0-0.7, 0.7-1.3, 1.3+; x-axis: Time (13:00 to 16:00); y-axis: Instance count] 
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Backing Up and Restoring KV Store 


e To back up KV store data in a SHC: 
1. Confirm the KV store status with CLI or MC 
   splunk show kvstore-status 
2. From a member with backupRestoreStatus and status in ready state, run: 
   splunk backup kvstore [-archiveName <archive>] [-collectionName <collection>] [-appName <app>] 
» Dumps an output to: 
   $SPLUNK_HOME/var/lib/splunk/kvstorebackup/kvdump_<timestamp>.tar.gz 
e To restore when the majority of SHC members are stale: 
1. Confirm the KV store status with CLI or MC 
   splunk show kvstore-status 
2. Run the restore command: 
   splunk restore kvstore -archiveName <archive>.tar.gz 
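
The two backup steps can be scripted as a readiness check followed by building the backup command. The sketch below is an illustration under assumptions: the sample status text is a hand-written excerpt whose layout is assumed rather than captured from a real system, and the helper names are hypothetical. Only the three options shown on the slide are used.

```python
def parse_member_status(status_text):
    """Collect "key : value" pairs; the first occurrence of each key wins."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in fields:
            fields[key] = value.strip()
    return fields


def ready_for_backup(status_text):
    """True when both backupRestoreStatus and status report a ready state."""
    fields = parse_member_status(status_text)
    return all(fields.get(key, "").lower() == "ready" for key in ("backupRestoreStatus", "status"))


def backup_command(archive=None, collection=None, app=None):
    """Argument list for 'splunk backup kvstore' with the optional flags from the slide."""
    args = ["splunk", "backup", "kvstore"]
    for flag, value in (("-archiveName", archive), ("-collectionName", collection), ("-appName", app)):
        if value:
            args += [flag, value]
    return args


# Hand-written sample for illustration; real output contains more fields.
sample = """This member:
    backupRestoreStatus : Ready
    replicationStatus : Non-captain KV store member
    standalone : 0
    status : ready
"""

if ready_for_backup(sample):
    print(" ".join(backup_command(archive="kvdump_lab", app="search")))
```

The argument list could be handed to subprocess.run on a member where the splunk binary is on the PATH.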
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Fixing Stale KV Store Members 


e When KV store members fail on the write operations, they might become stale 
- Run splunk show kvstore-status to determine the stale members 
e If the majority of members are stale, then the SHC requires a new bootstrap 
1. Determine the KV store data authority and back up 
2. On all SHC members: 
       splunk stop 
   a. Remove the KV store cluster info 
       splunk clean kvstore -cluster 
   b. Clean KV store local data 
       splunk clean kvstore -local 
   c. Reset SHC raft data 
       splunk clean raft 
       splunk start 
3. Bootstrap a new SHC and add members back 
       splunk bootstrap shcluster-captain -servers_list <sh> 
       splunk add shcluster-member -current_member_uri <captain> 
       splunk show shcluster-status 
       splunk show kvstore-status 
4. Restore the KV store data 
       splunk restore kvstore -archiveName <kvdump> 
5. From the captain, sync KV store (if you still see stale members) 
       splunk resync kvstore 
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Fixing Stale KV Store Members (cont.) 


e If only a few members are stale, then resync them from one of the members 
- Stale member: 
    splunk show kvstore-status 
    splunk stop 
    splunk clean kvstore -local 
    splunk start 
- SHC captain: 
    splunk resync kvstore [-source <KVstore_source_GUID>] 

Note: Specify the optional -source if you want to use a member other than the captain. 
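
The choice between the two procedures depends on how many members are stale. The sketch below encodes that decision, assuming "majority" means more than half of the members and that member states have already been read from splunk show kvstore-status; the function name and the state strings are hypothetical.

```python
def recovery_plan(member_states):
    """Pick a recovery path from a mapping of member name to KV store state."""
    stale = sorted(name for name, state in member_states.items() if state == "stale")
    if not stale:
        return "none", stale
    if len(stale) > len(member_states) / 2:
        # Majority stale: back up, clean, bootstrap a new SHC, then restore.
        return "bootstrap", stale
    # Only a few stale: clean the local KV store on each and resync from the captain.
    return "resync", stale


print(recovery_plan({"sh2": "ready", "sh3": "ready", "sh4": "stale"}))
print(recovery_plan({"sh2": "stale", "sh3": "ready", "sh4": "stale"}))
```

The first call returns the resync path for sh4 alone; the second returns the bootstrap path because two of three members are stale.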
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SHC KV Store Bundle Replication to Indexers 


e To enable an automatic lookup with KV store data, you must enable replication in collections.conf 
- Each automatic lookup configuration is limited to a specific host, source, or source type 

[Diagram: ... | outputlookup mykv] 

collections.conf on all members 

    [mykv] 
    enforceTypes = true 
    field.x = number 
    field.y = string 
    accelerated_fields.xl2 = {"x": 1, "y": 1} 
    replicate = true 
    replication_dump_strategy = one_file|auto 
    replication_dump_maximum_file_size = <KB> 
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Notable KV Store Log Channels 


e metrics.log has two subgroups under group=kvstore to track different activities 
- name=dump OR name=sync 
- Example: Show the sync operation duration over time 
    index=_internal metrics group=kvstore | timechart max(msSyncTotal) by name 
e splunkd.log contains various components for KV store status, start/stop, sync, and replication activities 
- Logs only WARN/ERROR events by default 
- Can increase the log verbosity in server.conf 
    [kvstore] 
    verbose = true 
- For a list of KV store components, search: 
    index=_internal sourcetype=splunkd component IN (kvstore*, collection*) 
e kvstore.log contains the introspection data feeding the _introspection index 
    index=_introspection sourcetype=kvstore 
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KV Store Operations Log Size 


e The journal keeps a record of all operations that modify the collections 
e A larger journal can give a KV store cluster a greater tolerance for lag and even make the set more resilient 
e The KV store allocates the full log size the first time Splunk is started, regardless of its utilization 
- The operations log is 1 GB by default 
- To adjust, edit [kvstore] oplogSize in server.conf 
» Must edit all nodes and run splunk clean kvstore -local 
e So, what is consuming all that journal space? 
- A large | outputlookup operation can generate a lot of records 
- outputlookup append=true vs. outputlookup append=false 
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Monitoring KV Store with Monitoring Console 


e The historical KV store performance views in Monitoring Console are derived from index=_introspection 
e Start with the deployment-wide view and drill down to an instance level 
- Median Page Faults per Operation shows the read activity involving disk I/O 
» A high rate (1.3+) indicates a need for more RAM 
- Replication latency shows the operation lags between the primary and secondary nodes during writes 
» Long lags (30+) can indicate replication (write) issues 
» You may need to increase the operation log size 
- Together with a high lock percentage (50%+) and flushing rate (50~100%), you can infer that there are heavy write operations or the system is sluggish 
- Operations Log Window of KV Store Captain shows the time between the first and last operations in the journal 
e Operations per Minute in the KV Store: Instance dashboard shows the number of calls made to the KV store operations in detail 
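
The thresholds quoted on this slide can be collected into a small triage helper. This is a rough sketch of one reading of the slide, not a Splunk feature: the function name is hypothetical, the inputs are assumed to be values read off the dashboards, and the combined rule fires only when lag, lock, and flush rates are all high.

```python
def kvstore_findings(page_faults_per_op, replication_latency, lock_pct, flush_pct):
    """Map dashboard readings to the interpretations given on the slide."""
    findings = []
    if page_faults_per_op >= 1.3:
        findings.append("page faults 1.3+: reads are hitting disk, consider more RAM")
    lagging = replication_latency >= 30
    if lagging:
        findings.append("replication latency 30+: possible write issues, consider a larger operations log")
    if lagging and lock_pct >= 50 and flush_pct >= 50:
        findings.append("high lock and flush rates: heavy writes or a sluggish system")
    return findings


for finding in kvstore_findings(page_faults_per_op=1.5, replication_latency=45, lock_pct=60, flush_pct=80):
    print(finding)
```

A healthy instance, such as the ones in the KV Store Status table with 0.00 page faults per operation, produces an empty list.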


Splunk Cluster Administration | Copyright © 2022 Splunk, Inc. All rights reserved | 25 February 2022 


Further Reading: KV Store 

e About KV Store 
- docs.splunk.com/Documentation/Splunk/latest/Admin/AboutKVstore 
e Configure KV Store lookups: 
- docs.splunk.com/Documentation/Splunk/latest/Knowledge/ConfigureKVstorelookups 
e KV Store backup and restore 
- docs.splunk.com/Documentation/Splunk/latest/Admin/BackupKVstore 
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Lab Exercise 8 — Add a KV Store Collection 


e Time: 25 - 30 minutes 


e Tasks: 
- Identify the current SHC captain and KV store captain 
- Verify the state of the KV store service from Monitoring Console 
- Enable the KV store collection replication 
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Module 9: 
Introduction to SmartStore 
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Module Objectives 


e List the use cases for deploying SmartStore 

e Provide a SmartStore architecture overview 

e Enable SmartStore in indexer cluster 

e Monitor SmartStore status and identify potential issues 




What is SmartStore and Why Use It? 


e Scale-out distributed architecture is not a good fit for growing volumes 
- As Splunk deployment and adoption matures, demand for storage outpaces compute resources 
- Adding more peer nodes in response to data growth is expensive 
e To scale beyond typical indexer clustering use cases, separating storage from computation resources has benefits 
- Lower cost of data retention 
- Higher reliability 
- Better scalability 
e SmartStore is a Splunk indexer capability used to separate the compute resources from storage 
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SmartStore Architecture Overview 


e S3-compliant storage 
- https://en.wikipedia.org/wiki/Amazon_S3 
- https://github.com/splunk/s3-tests 
e Cache manager 
- Uploads warm buckets to a remote storage 
- Downloads (fetches) buckets from a remote storage on an as-needed basis 
- Evicts buckets from the local cache based on the bucket's age, search frequency, and other tunable criteria 

[Diagram: Cache Manager between Local Storage and Remote Storage] 
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Considerations for Moving to SmartStore 


e Storage utilization is significantly outpacing other resources 
e A considerable amount of admin time is devoted to managing the cluster because the manager node is always busy 
- Bucket fixups, data rebalancing, or software upgrade 
e When most searches are over recent data 
e NOT beneficial, if you run: 
- Frequent, rare searches with long lookbacks 
- If data is cached, no performance impact 
e NOT SUPPORTED: 
- Data integrity 
- Disabling bloom filters 
- Custom paths such as bloomHomePath, summaryHomePath, tstatsHomePath, coldToFrozenDir, etc. 
e Latest on restrictions: 
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/AboutSmartStore 
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Deploying SmartStore - Requirements 


e Same as any Splunk indexer requirements 
- http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Systemrequirements 
- Except, the preferred local storage type is SSD 
e Each on-prem machine must run the same Linux 3.x+ 64-bit OS 
e For clustering, the peer node configurations must be identical 
- Can configure SmartStore globally or on a per-index basis 
- The local storage on each peer must be in proportion to the eviction policy 
» Recommend to reserve enough storage for a 30-day retention 
- Replication factor and search factor must be equal 
» Only hot buckets follow traditional replication policies 
e homePath and coldPath must point to the same partition 
e Resize maxDataSize=auto_high_volume to auto 
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SmartStore Index Time Workflow 

1. Data is written to a hot bucket and gets replicated as normal 
2. As the bucket rolls into warm (cached), it is uploaded to remote storage 
3. Replication of the warm bucket stops 
4. After the warm bucket is successfully uploaded to the remote storage, the local bucket is eligible for eviction 
5. Buckets freeze directly from warm 

[Diagram labels: Cache (SSD); S3-compliant storage] 

Note: When the local warm bucket gets evicted, only the content of the bucket is removed. The bucket directory is empty but not deleted. 
The thawed data is stored only locally. The cache manager does not manage it in any way. 
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Deploying SmartStore - New 





1. Enable a manager for a new indexer cluster 
2. Configure the SmartStore settings (no UI) 
- master-apps/_cluster/local/indexes.conf 
- Specify the S3 volume and optionally an index 
» An index stanza needs homePath, thawedPath, and coldPath 
» The coldPath is not used but required 
3. Add peer nodes 
- Peers get the master-apps bundles when they join 
4. Check the deployment 
- When the cluster is complete, verify the connection from a peer node: 
    splunk cmd splunkd rfs -- ls index:myidx 

indexes.conf 

    [default] 
    remotePath = volume:s3/$_index_name 
    repFactor = auto 

    [volume:s3] 
    storageType = remote 
    path = s3://path/to/mysplunk/buckets/ 
    remote.s3.access_key = <S3 access key> 
    remote.s3.secret_key = <S3 secret key> 
    remote.s3.endpoint = https://<S3 host> 

    [myidx] 
    homePath = $SPLUNK_DB/myidx/db 
    thawedPath = $SPLUNK_DB/myidx/thaweddb 
    coldPath = $SPLUNK_DB/myidx/colddb 
    frozenTimePeriodInSecs = 
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Deploying SmartStore - Migration 

e Pre-existing single-site buckets must follow the multisite policy 
- Set constrain_singlesite_buckets=false in the manager's server.conf 
e Review the current restrictions and reconfigure all nodes in the cluster to conform to the SmartStore requirements 
e Test the SmartStore configurations and remote connectivity 
e If the test is successful, apply the bundle from the manager node 
- The manager kicks off the migration but it happens entirely on the peer nodes 
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/MigratetoSmartStore 
e SmartStore index conversion can take a long time 
- To monitor the migration process, search on the manager: 
    | rest /services/admin/cacheman/_metrics | fields splunk_server migration.* 
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Deploying SmartStore - Bootstrap 


e Bootstrapping fetches buckets present on a remote storage to new node(s) 
- Confirm all buckets are present before bringing down the old node(s) 
e To bootstrap, use the same remote storage settings on the new node(s) 
e The manager node coordinates a discovery process per index 
1. The manager requests a peer to retrieve a bucket list for a given index 
2. The peer reports back with the bucket list 
3. The manager distributes a set of primaries to fetch across all available peer nodes 
4. Each peer 
» Gets its assigned buckets from the remote storage 
» Recreates the bucket's metadata locally and sets it as the primary 
» Replicates only the primary's metadata to other peer nodes 
5. Other peers recreate buckets per metadata and report their buckets to the manager as replicated buckets 
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Index Retention Policy with SmartStore 


e SmartStore indexes do not use the cold buckets 
- Buckets roll to frozen directly from warm 


e Data retention is managed cluster-wide 


e Attributes affecting the SmartStore data retention: 
- maxGlobalDataSizeMB (new) 
» Applies to all buckets on a per-index basis and across all peers in the cluster 
» Includes the total size on remote storage plus all warm buckets on all peer nodes to be 
uploaded to the remote storage 
» Counts only one copy of each bucket 
- frozenTimePeriodInSecs behaves the same as in non-SmartStore index rolling 
- maxGlobalDataSizeMB takes precedence over frozenTimePeriodInSecs 
- The existing maxTotalDataSizeMB and maxWarmDBCount settings are ignored 
e Freezing removes buckets from both local and remote storage 
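
To make the interaction of the two retention attributes concrete, the sketch below selects freeze candidates for one index. It is an illustration under stated assumptions, not Splunk's implementation: the Bucket class and function name are hypothetical, each bucket is counted once regardless of how many copies exist, a size limit of 0 is treated as "no limit", and the oldest buckets are assumed to be frozen first when the size limit is exceeded.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    bucket_id: str
    latest_event_time: int  # epoch seconds of the newest event in the bucket
    size_mb: int            # size of ONE copy of the bucket


def buckets_to_freeze(buckets, now, frozen_time_period_secs, max_global_data_size_mb):
    """Return the ids of buckets that would be frozen, oldest first."""
    by_age = sorted(buckets, key=lambda b: b.latest_event_time)
    # Age rule: every event in the bucket is older than frozenTimePeriodInSecs.
    frozen = [b for b in by_age if now - b.latest_event_time > frozen_time_period_secs]
    kept = [b for b in by_age if b not in frozen]
    # Size rule: keep freezing the oldest remaining bucket until the index fits.
    if max_global_data_size_mb > 0:
        total = sum(b.size_mb for b in kept)
        while kept and total > max_global_data_size_mb:
            oldest = kept.pop(0)
            total -= oldest.size_mb
            frozen.append(oldest)
    return [b.bucket_id for b in frozen]


index_buckets = [Bucket("a", 100, 400), Bucket("b", 500, 400), Bucket("c", 900, 400)]
print(buckets_to_freeze(index_buckets, now=1000, frozen_time_period_secs=600, max_global_data_size_mb=500))
```

In this example bucket a is frozen by age, and bucket b is frozen because the remaining 800 MB still exceed the 500 MB limit.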
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SmartStore Index Freezing Process 

e The manager node assigns a freeze job to a peer node that has a local bucket copy 
- Identifies freeze candidates every 15 minutes by default 
e The peer node removes the remote bucket first and then the local copy 
- Any buckets participating in a search are forestalled from deletion for up to 5 minutes and then are force-removed 
- Notifies the manager node when the bucket is frozen 
e The manager node instructs other peer nodes to delete their local copies 
e If you thaw data from an archive, the thawed buckets only exist locally 
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SmartStore Search Considerations 

e SmartStore is optimized for certain characteristics of Splunk searches 
- Splunk searches typically occur over near-term data 
- Searches often have spatial and temporal locality 
e The searches always occur in local storage 
- The cache manager favors recently created and accessed buckets 
- The cache manager directs fetching and eviction of buckets 
- Cached buckets searched infrequently will likely get evicted 
e Only the primary hot buckets participate in a search 
e Only the primary buckets from a designated peer node are fetched and accessed locally 
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Basic SmartStore Search- Time Workflow 


1. SH dispatches a job to search peers 


2. Peers spawn search processes and read a bucket list 


3. Each search process searches only the "opened" buckets 
- Not all searches require the entire content 
- Fetches bucket files on an as-needed basis 
- A bucket is "opened" only when it is available locally 
- The cache manager blocks the process until a bucket is "opened" 
- Hot buckets are always "opened" 


4. When the search is done, buckets are "closed" and become eligible for eviction 
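The open, search, and close steps above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Splunk code: `ToyCacheManager`, `run_search`, and the in-memory dictionaries standing in for remote and local storage are all hypothetical, and the fetch is synchronous rather than a real blocking wait.

```python
class ToyCacheManager:
    """Hypothetical stand-in for the cache manager on a peer node."""

    def __init__(self, remote):
        self.remote = remote    # bucket id -> events held in remote storage
        self.local = {}         # bucket id -> events cached on local disk
        self.open_count = {}    # bucket id -> number of searches using it

    def open(self, bucket_id):
        # Fetch from remote storage only on a cache miss, then mark as in use.
        if bucket_id not in self.local:
            self.local[bucket_id] = self.remote[bucket_id]
        self.open_count[bucket_id] = self.open_count.get(bucket_id, 0) + 1
        return self.local[bucket_id]

    def close(self, bucket_id):
        self.open_count[bucket_id] -= 1

    def evictable(self):
        # Only buckets that no search currently holds open may be evicted.
        return sorted(b for b in self.local if self.open_count.get(b, 0) == 0)


def run_search(cache, bucket_ids, term):
    """Search each bucket in the list; a bucket is read only after it is opened."""
    hits = []
    for bucket_id in bucket_ids:
        events = cache.open(bucket_id)  # returns once the bucket is local
        try:
            hits.extend(e for e in events if term in e)
        finally:
            cache.close(bucket_id)      # closed buckets become evictable
    return hits
```

A bucket that is still held open by another search stays out of `evictable()`, which mirrors the rule that buckets become eligible for eviction only after they are closed.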






[Figure: search-time workflow diagram showing S3-compliant storage] 

Splunk Cluster Administration, 25 February 2022 
Copyright © 2022 Splunk, Inc. All rights reserved 


More SmartStore Search-Time Workflow 


e Summarization 
- Search head dispatches summary jobs to peer nodes 
- The peer generating the summary uploads its summary buckets 
- When a search needs the summary, the cache manager opens the summarized bucket 
- Ad-hoc summaries are not uploaded to the remote storage 
e | delete 
- The primaries delete the events locally 
- Cache manager syncs the delete journals in the remote storage 
e When a peer fails, SmartStore bucket fixup is basically the same as for classic (non-SmartStore) indexes 
- If still in a valid state, a fixup replicates only the bucket's metadata (faster recovery) 
- If not valid, the manager invalidates the current bucket list, assigns a peer to fetch, and regenerates the list 
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Cache Size Settings 


e Buckets get evicted only when the cache manager determines that the host is facing space constraints 
- Hot buckets are always local and do not get evicted 
- The warm buckets continue to reside in remote storage 
e The cache size settings apply on the peer (indexer-wide) 

server.conf 
[diskUsage] 
minFreeSpace = 5000 

[cachemanager] 
eviction_padding = 5120 
max_cache_size = 0 





e The cache manager consumes all free partition space until the free space is 
less than the total padding size 


Total padding = minFreeSpace + eviction_padding (10 GB by default) 
e To limit the cache size, additionally set the max_cache_size setting 


http://docs.splunk.com/Documentation/Splunk/latest/Indexer/ConfigureSmartStorecachemanager 
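The padding arithmetic can be worked through in Python. This is a sketch only: the function names are hypothetical, the minFreeSpace value of 5000 MB is an assumption chosen to match the roughly 10 GB default total stated above, and treating max_cache_size = 0 as "no explicit limit" is likewise an assumption.

```python
def total_padding_mb(min_free_space_mb=5000, eviction_padding_mb=5120):
    """Total padding = minFreeSpace + eviction_padding, in MB."""
    return min_free_space_mb + eviction_padding_mb


def needs_eviction(free_space_mb, cache_used_mb, max_cache_size_mb=0,
                   min_free_space_mb=5000, eviction_padding_mb=5120):
    """Return True when this toy model would start evicting buckets."""
    # Rule 1: free partition space has dropped below the total padding.
    if free_space_mb < total_padding_mb(min_free_space_mb, eviction_padding_mb):
        return True
    # Rule 2: an explicit cache limit is set (0 = unlimited, assumed) and exceeded.
    return max_cache_size_mb > 0 and cache_used_mb > max_cache_size_mb
```

With these assumed defaults the total padding is 10120 MB, so a peer with 9000 MB of free space would already be evicting, while a peer with 20000 MB free would evict only if max_cache_size is set and exceeded.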
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SmartStore Eviction Policy 


e The eviction_policy setting in server.conf and associated parameters 
determine the eviction priority 
- lru (default): evict the least recently used bucket 
» Do not change this default without consulting Splunk Support 
- Rawdata and TSIDX get evicted before metadata 


e To protect critical data from eviction, you can enhance the policy: 
- hotlist_recency_secs: the default keeps the buckets containing events from the last 24 hours 
- hotlist_bloom_filter_recency_hours: the default keeps the bucket metadata (bloomfilter) for the last 15 days 
e Can apply globally (server.conf) or on a per-index level (indexes.conf) 
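The interaction between lru eviction and hotlist_recency_secs can be sketched in Python. This is an illustrative model, not the cache manager's actual algorithm: the `eviction_order` function and the bucket dictionaries are hypothetical, and the sketch assumes that protected buckets are deferred to the end of the eviction order rather than excluded outright.

```python
def eviction_order(buckets, now, hotlist_recency_secs=86400):
    """Return bucket ids in the order this toy model would evict them.

    buckets: list of dicts with 'id', 'last_access', and 'latest_event_time'
    (all times in epoch seconds).
    """
    def is_protected(bucket):
        # Buckets holding events newer than hotlist_recency_secs are deferred.
        return now - bucket["latest_event_time"] < hotlist_recency_secs

    def by_lru(bucket):
        # Least recently used first.
        return bucket["last_access"]

    unprotected = sorted((b for b in buckets if not is_protected(b)), key=by_lru)
    protected = sorted((b for b in buckets if is_protected(b)), key=by_lru)
    return [b["id"] for b in unprotected + protected]
```

Under this model a bucket with recent events stays cached even if it was accessed long ago, which is the effect the hotlist settings are meant to provide.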
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Notable SmartStore Log Channels 


e splunkd.log 
- CacheManager, CacheManagerHandler: remote storage activities (INFO) 
- S3Client, CacheManagerEviction, StorageInterface (WARN) 


e search.log on peer nodes 
- CacheManagerHandler: the cache manager bucket operations 
- S2BucketCache: search-time bucket management (bucket open/close) 


e audit.log: bucket upload, download, eviction, and removal events 
e metrics.log: metrics for remote storage operations 
index=_internal metrics group IN(cachemgr*, spacemgr) 
e splunkd_access.log: records the cache manager calls from searches 
index=_internal sourcetype=splunkd_access uri_path="/services/admin/cacheman*" 
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Monitoring Console: SmartStore 


e Per-instance and deployment-wide SmartStore activity information 
- Remote storage connectivity 
- Cache hits and misses 
- Bucket upload and download 
- Cache thrashing 
- Migration process 
- Bootstrapping progress 
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Further Reading: Splunk SmartStore 


e About SmartStore 
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/AboutSmartStore 

e How SmartStore works 
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/SmartStoresearching 

e Troubleshoot SmartStore 
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/TroubleshootSmartStore 
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Lab Exercise 9 — Migrate a Cluster to SmartStore 


e Time: 25 minutes 


e Tasks: 
- Verify the SmartStore connectivity from a test instance 
- Test the SmartStore configuration on a test instance 
- Upgrade the indexer cluster to meet the SmartStore prerequisites 
- Configure and deploy the SmartStore bundle to peer nodes 
- Validate the SmartStore-enabled cluster 
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Course Wrap-up 
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What's Next? 


e Splunk Certification program 
https://www.splunk.com/en_us/training/faq-training.html 
- Splunk Core Certified User 
- Splunk Core Certified Power User 
- Splunk Enterprise Certified Admin 
- Splunk Enterprise Certified Architect 
- Splunk Certified Developer 
e Program information 
- https://www.splunk.com/pdfs/training/Splunk-Certification-Candidate-Handbook.pdf 
e Exam registration 
- https://www.splunk.com/pdfs/training/Exam-Registration-Tutorial.pdf 





e If you have further questions, send an email to: certification@splunk.com 
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YouTube: The Splunk How-To Channel 


e In addition to our roster of training courses, check out the Splunk 
Education How-To channel: http://www.youtube.com/c/SplunkHowTo 


e This site provides useful, short videos on a variety of Splunk topics 


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